REPORT DOCUMENTATION PAGE

UNCLASSIFIED

Report number: TR-17 (ARO-8049.16-M)
Title: A Super-Population Approach to Multi-Stage Sampling
Type of report: Technical Report
Authors: R. K. Burdick, H. O. Hartley, L. J. Ringer
Contract or grant number: DAHC04-74-C-0018
Performing organization: Texas A&M University, College Station, Texas
Report date: August 1976
Security classification: Unclassified
Distribution statement: Approved for public release; distribution unlimited
Supplementary notes: The findings in this report are not to be construed
as an official Department of the Army position, unless so designated by
other authorized documents.
Key words: Statistics; Population (statistics); Estimating; Sampling
Abstract: (see other side)

ARO-D PROJECT DAHC04 74 C 0018


Technical Report No. 17


A SUPER-POPULATION APPROACH TO

MULTI-STAGE SAMPLING


R. K. Burdick, H. O. Hartley, and L. J. Ringer


August 1976


Approved for public release; distribution unlimited


ABSTRACT 


This  report  develops  a new  technique  for  the  estimation  of 
finite  population  parameters  in  a multi-stage  sample  survey. 
Specifically,  estimators  and  confidence  intervals  for  parameters 
of  the  finite  population  are  developed  for  two-stage  sampling  when 
primaries  are  of  either  equal  or  unequal  size,  two-stage  sampling 
when  the  variable  of  interest,  y,  is  related  to  another  variable, 
x,  and  p-stage  sampling  with  an  example  of  three-stage  sampling 
when  units  are  of  equal  size.  The  stochastic  procedure  generating 
the  sample  is  assumed  to  be  a two  step  procedure  where  the  first 
step  is  selection  of  a "large  sample"  from  an  infinite  super- 
population and  the  second  step  is  the  actual  implementation  of  the 
sample  survey. 

1. INTRODUCTION
1.1  Preliminaries 

In  many  sample  surveys  of  finite  populations,  single-stage 
sampling  designs  are  either  too  costly  or  physically  impractical  to 
use,  and  the  need  for  a multi-stage  design  arises.  In  a multi-stage 
sampling  design,  the  population  is  divided  into  large  units,  called 
primaries,  which  are  further  subdivided  into  smaller  units  known  as 
secondaries. The primary units are then sampled and subsamples of
the  secondary  units  are  taken  from  the  selected  primaries.  If 
necessary,  the  secondary  units  may  also  be  subsampled  until  a desired 
sampling  element  is  obtained.  Sukhatme  [1947]  offers  an  example  of 
a survey  to  estimate  wheat  production  in  India.  The  district  of 
Moradabad  is  divided  into  six  divisions  and  within  each  division 
a sample of eight villages is selected. Two wheat growing fields
are  subsampled  from  each  village  and  a further  subsample  of  plots 
is  chosen  from  each  of  the  selected  fields.  Kish  [1952]  considers  a 
two-stage  sampling  design  of  a city  with  a sample  of  blocks  selected 
in  the  first  stage  and  a sample  of  dwelling  units  taken  from  the 
selected  blocks  in  the  second  stage.  Kish  [1965]  provides  an  example 
of  a three-stage  design  and  gives  an  extensive  list  of  similar  case 
studies.

Both Deming [1950] and Kish [1965] state that multi-stage
designs are often less expensive than single-stage designs due to
the reduced cost in preparing sample frames and the reduction of
interviewers' travel time. Whereas a sampling frame of the entire
population is needed for single-stage designs, sampling frames in
a multi-stage design are needed only for the elements whose larger
units have been selected at an earlier stage. The cost of
interviewers' travel time is reduced since the elements an interviewer
must sample are closer together.

The  goal  of  any  sample  survey  is  estimation  of  parametric 
functions  of  the  finite  population.  Under  classical  frequentist 
theory,  any  inference  made  concerning  the  population  reflects  the 
expected  behavior  of  repeated  samples  from  the  same  finite  population. 
Under  this  assumption  and  through  the  use  of  conditional  expectations, 
the  theory  of  multi-stage  sampling  has  developed  from  the  results 
of  single-stage  sampling  theory. 

In  a two-stage  design  when  primaries  are  selected  with  equal 
probabilities  and  without  replacement,  Raj  [1968,  p.  114]  and 
Cochran  [1963,  p.  304]  give  an  unbiased  estimator  of  the  population 
total,  the  variance  of  the  estimator,  and  an  estimator  of  the 
variance.  For  the  case  when  primaries  are  selected  with  unequal 
probabilities  and  without  replacement,  Raj  [1968,  p.  118]  provides 
a general  estimator  for  the  population  total  provided  unbiased 
estimators  of  the  primary  total  and  its  variance  exist.  Raj  [1968, 
p.  119]  and  Cochran  [1963,  p.  305]  give  similar  estimators  when 
primaries  are  selected  with  replacement  under  various  subsampling 
schemes.  Other  results  such  as  extension  to  stratified  multi-stage 
sampling  and  estimation  of  ratios  are  found  in  Raj  [1968]  and 
Cochran  [1963]. 

Unlike classical frequentist theory, the approach in this
report will assume that the finite population is a sample of
size $M_0$ from an infinite super-population. The actual implementation
of the survey is a process of taking a multi-stage subsample from
the selected finite population. Inferences may be desired for either
the parameters of the super-population, or for the finite population,
although technically the "parameters" of the finite population are
now statistics summarizing the sample of size $M_0$ drawn from the
super-population. This report will be concerned with inferences
regarding the "parameters" of the finite population.

1.2  Literature  Review 

Before  discussing  the  super-population  model,  it  seems  necessary 
to  first  mention  the  direction  of  recent  research  in  survey  sampling 
theory.  Rao  [1971]  has  stated  that  until  recently,  survey  sampling 
theory  has  evolved  through  an  inductive  process.  Reasonable  estimators 
and  sampling  designs  have  been  suggested  and  their  properties  examined 
either  analytically  or  empirically.  It  has  become  evident  that  some 
general  theory  of  sampling  is  needed  to  better  relate  the  inference 
of  sample  surveys  to  statistical  inference.  Many  recent  attempts 
have  been  made  to  formalize  the  theory  of  sampling  from  finite 
populations,  but  much  confusion  and  controversy  are  associated  with 
these. 

Godambe [1955, 1966] has introduced the notion of estimators
which are label dependent, i.e., not invariant to permutations of
the labels attached to the units. He has shown that in this more
general class of estimators, the customary optimal estimators used
in the classical theory lose their classical optimality properties.
However, the estimators which are superior to the label invariant
estimators depend on the characteristics of non-sampled units in the
population and are, therefore, not practically available. Many
other authors have extended and refined Godambe's original work
and are referenced in Godambe [1969]. Ericson [1969a, 1969b] used
the model suggested by Godambe to consider a Bayesian approach of
inference for the finite population.

Hartley and Rao [1968, 1969] and independently Royall [1968]
have established certain optimality properties for the customarily
used estimators in the basic designs within the class of "scale
load" estimators, i.e., estimators which are invariant to permutations
of the labels. Practically all estimators that have been used in
practice belong to this class, although Hartley and Rao [1969] have
stated that they do not exclude label dependent estimators from
consideration and give examples of instances when they will be
useful. However, they have not provided any general guidelines which
infallibly indicate under what circumstances label dependent
estimators should be used. They have stated that a sufficient condition
for the use of label dependent estimators arises when some or all of
the parameter functions in the population to be estimated are
themselves not invariant to label permutations. Among the results of
this theory within the class of "scale load" estimators are UMV-ness
of the sample mean in random sampling, the Horvitz-Thompson estimator
when sampling with probability proportional to size, and the maximum
likelihood properties of an estimator similar to the regression
estimator when the population mean of a concomitant variable is
known.

Godambe  and  Sprott  [1971]  and  Johnson  and  Smith  [1969]  contain 
excellent  papers  expressing  the  different  viewpoints  concerning 
these sampling theories. The comments of Barnard [1969], Rao [1971],
and  Godambe  [1970]  also  indicate  the  diversity  of  opinion  on  these 
issues.  Rao  [1973]  provides  an  extensive  bibliography  of  other 
studies  on  this  problem. 

In  using  a super-population  model,  this  report  is 
concerned  with  a new  theory  for  survey  sampling.  By  assuming  the 
finite  population  to  be  a sample  from  an  infinite  super-population 
defined  by  a linear  model,  the  general  theory  of  linear  models  can 
be  applied  to  the  problem  of  estimating  the  finite  population 
parameters.

1.3  Super-Population  Models 

The  use  of  a super-population  model  is  not  a new  concept  in 
the  statistical  literature.  One  of  the  earliest  to  explicitly 
state the super-population model was Cochran [1939] when he
considered estimation of the finite population mean for simple random
and  stratified  sampling  designs.  Cochran  [1946]  again  used  the 
model  to  compare  systematic  and  stratified  samples  from  populations 
where  the  variance  within  a group  of  elements  increases  as  the  group 
size increases. Raj [1958] regarded the finite population as a
random sample from an infinite super-population to compare a
probability proportional to size estimator with the simple average,
ratio,  regression,  and  stratified  sample  estimators.  Royall  [1970] 
used  such  a model  to  develop  optimal  sampling  plans  for  estimating 
a finite  population  total.  Many  other  examples  exist  and  the 
interested  reader  is  referred  to  Fuller  [1973]  and  Rao  [1973]  for 
additional  references. 

The  notion  of  an  underlying  super-population  also  accompanied 
the  introduction  of  analytical  surveys.  Deming  [1950,  Ch.  7]  and 
Cochran  [1963,  p.  37]  state  that  when  comparing  domain  means  in  an 
analytical  survey,  the  null  hypothesis  is  that  the  domains  have  been 
drawn from the same infinite population. Sedransk [1965] also
expresses the view that inferences refer to a more "general"
population than the existing finite population. Konijn [1962] made similar
assumptions and considered estimators for functions of the
super-population parameters. Fuller [1973] gives results for the estimation
of  parameters  of  the  infinite  population  in  a two-stage  sampling 
design. 

Of more particular interest in this report are the papers
concerned with estimation of the finite population parameters as
opposed to the estimation of the super-population parameters. Royall
and Herson [1973a, 1973b] assumed a super-population model in
estimating parameters of the finite population for single-stage
designs and examined results when the assumed model broke down.
Hartley and Sielken [1975] considered a more general case than Royall
and Herson where auxiliary variables are not fixed in the
super-population. Scott and Smith [1969] assumed a super-population model
when using a Bayesian approach to derive estimators of linear functions
of finite population parameters in two-stage sampling.

The  preceding  discussion  may  best  be  summarized  by  Table  1 
reproduced  from  Hartley  and  Sielken  [1975].  In  regard  to  multi-stage 
sampling,  the  standard  results  discussed  in  Section  1.1  are  classified 
as  Case  1.  Analytical  surveys  and  similar  studies  concerned  with 
estimating  parameters  of  the  infinite  population  are  Case  3.  The 
papers  of  Hartley  and  Sielken  [1975],  Royall  and  Herson  [1973a,  1973b], 
and  Scott  and  Smith  [1969]  are  concerned  with  Case  2. 

TABLE 1

Sampling Theories Classified by Sampling Procedure
and Target Parameters

                                      Sampling Procedure

Target Parameters     Repeated sampling from a     Repeated two-step sampling
                      fixed finite population      from an infinite population
--------------------  ---------------------------  ----------------------------
Parameters of         Classical finite population  Super-population theory for
finite population     sampling theory              finite population sampling
                      = Case 1                     = Case 2

Parameters of         Infeasible                   Inference on infinite
infinite                                           population parameters from
super-population                                   two-step sampling procedure
                                                   = Case 3

This report considers only Case 2. In particular, this
report takes a non-Bayesian approach to the problem considered
by Scott and Smith [1969] and serves as an extension to the work
done by Hartley and Sielken [1975].


1.4  Overview 


In this section the problems to be considered in this report
are briefly sketched, and their relationship to the work of
Hartley and Sielken [1975] is discussed.

Hartley and Sielken assume that the current finite population
is a random sample from a super-population of the form

$$ y = x^T \beta + \epsilon\,[v(x)]^{1/2} $$

where $x$ is a vector of variables, $\beta$ is a $(p \times 1)$ vector of unknown
constants, $\epsilon$ is a normal random variable independently distributed
of $x$, and $v(x)$ is any known function of $x$. The basic parameters of
interest for the current finite population are

$$ b = (X^T V^{-1} X)^{-1} X^T V^{-1} Y $$

and linear combinations of $b$, say $c^T b$. Once a sample survey of the
finite population is conducted, the population quantities $Y$ and $X$
can be partitioned into

[equation illegible in source]

is such that for any $c$

[equation illegible in source]

is a $100(1-\alpha)\%$ "confidence interval" on $c^T b$ where

[equation illegible in source]

If $X$ is unknown

[equation illegible in source]

No assumptions on the distributions of $X$ and $X_s$ are made except that

[equation illegible in source]

is of full rank with probability one.

In Section 2 and Section 3 the super-population model for
two-stage sampling is assumed to be

$$ y_{ij} = \mu + \alpha_i + \epsilon_{ij}, \tag{1.9} $$

where $y_{ij}$ refers to the j-th observation in the i-th primary and the
$\alpha_i$ and $\epsilon_{ij}$ are independently normally distributed with means 0 and
variances $\sigma_\alpha^2$ and $\sigma_\epsilon^2$ respectively. The finite population parameter of
interest is the population mean $\bar{Y}$. If the super-population model (1.9)
is rewritten as

$$ y_{ij} = \alpha_i^* + \epsilon_{ij} \tag{1.10} $$


where $\alpha_i^* = \mu + \alpha_i$, then conditioning upon the $\alpha_i^*$'s and letting
$\beta^T = [\alpha_1^*\ \ \alpha_2^*\ \ \ldots]$ gives

[equation illegible in source]    (1.11)

Then $c^T b = \bar{Y}$ when

$$ c^T = [M_1/M_0\ \ \ M_2/M_0\ \ \ \ldots] \tag{1.12} $$

where $M_i$ is the size of the i-th primary and $M_0$ is the total number of
elements in the finite population. The results of Hartley and Sielken
imply an unbiased estimator and a confidence interval for $\bar{Y}$ based on

$$ [y_1\ \ y_2\ \ \cdots] \tag{1.13} $$

only if $X_s$ is of full rank, i.e., only if every primary is sampled at
least once. Since not all primaries are sampled in a multi-stage
design, the results of Hartley and Sielken do not apply when the
super-population is considered in the form (1.10). To alleviate
this problem, the super-population model (1.9) is rewritten as

$$ y_{ij} = \mu + \eta_{ij} \tag{1.14} $$

where $\eta_{ij} = \alpha_i + \epsilon_{ij}$. Then, if the primaries are of equal size,
$b = \bar{Y}$, and the results of Hartley and Sielken will imply an unbiased
estimator of $\bar{Y}$. Furthermore, if $\sigma_\alpha^2/\sigma_\epsilon^2$ is known, the results of
Hartley and Sielken will also imply an exact confidence interval on
$\bar{Y}$ since then $X$, $X_s$, $V$, and $V_s$ are all known. In Section 2, two-stage
sampling in which the primaries are of equal size but $\sigma_\alpha^2/\sigma_\epsilon^2$ is
unknown is considered. In Section 3, two-stage sampling is considered
when primaries are not of equal size and consequently there does not
exist a $c$ such that $c^T b = \bar{Y}$.

In Section 4, the super-population for two-stage sampling is
assumed to be

$$ y_{ij} = \mu + \alpha_i + \beta x_{ij} + \epsilon_{ij} \tag{1.15} $$

where $x_{ij}$ is a variable related to $y$ and $\beta$ is a constant. The finite
population parameter of interest is still $\bar{Y}$. If the super-population
model (1.15) is rewritten as

$$ y_{ij} = [1\ \ x_{ij}] \begin{bmatrix} \alpha_i^* \\ \beta \end{bmatrix} + \epsilon_{ij} \tag{1.16} $$

where $\alpha_i^* = \mu + \alpha_i$, then as with (1.10), the results of Hartley and
Sielken do not apply to multi-stage sampling since not all primaries
are sampled and consequently $X_s$ is not of full rank. Furthermore,
even if the super-population model (1.15) is rewritten as

$$ y_{ij} = [1\ \ x_{ij}] \begin{bmatrix} \mu \\ \beta \end{bmatrix} + \eta_{ij} \tag{1.17} $$

where $\eta_{ij} = \alpha_i + \epsilon_{ij}$, the results of Hartley and Sielken do not apply
unless all primaries are of equal size so that there exists a $c$ such
that $c^T b = \bar{Y}$. In addition, to construct a confidence interval on $\bar{Y}$,
$\sigma_\alpha^2/\sigma_\epsilon^2$ must be known.

Finally,  in  Section  5,  a general  methodology  for  a p-stage 
sampling  design  is  discussed  and  the  results  for  a three-stage  design 
with  primaries  of  equal  size  and  secondaries  of  equal  size  are  given. 



2. TWO-STAGE SAMPLING WITH EQUAL
SIZES AND SAMPLES


2.1  Estimation  and  Variance  Formulas  for 
the  Finite  Population  Mean 

The  first  model  considered  is  a two-stage  sampling  design  when 
all  primaries  have  an  equal  number  of  elements  and  an  equal  number 
of  secondaries  are  sampled  from  each  primary.  The  notation  adopted 
is  that  of  Cochran  [1963]  in  which 

N = number  of  primaries  in  finite  population, 

M = number  of  secondaries  per  primary, 
n = number  of  sampled  primaries,  and 

m = number  of  sampled  secondaries  per  sampled  primary. 


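The linear model describing the super-population is

$$ y_{ij} = \mu + \alpha_i + \epsilon_{ij} \tag{2.1} $$

where

$$ \alpha_i \sim N(0, \sigma_\alpha^2), \tag{2.2} $$

$$ \epsilon_{ij} \sim N(0, \sigma_\epsilon^2), \tag{2.3} $$

and all $\alpha_i$ and $\epsilon_{ij}$ are independent. This linear model may also be
expressed as

$$ y_{ij} = \mu + \eta_{ij} \tag{2.4} $$

where the $\eta_{ij} = \alpha_i + \epsilon_{ij}$ are normal random variables with mean
zero and

$$ E(\eta_{ij}\eta_{kl}) = \begin{cases} \sigma_\alpha^2 + \sigma_\epsilon^2, & i = k,\ j = l, \\ \sigma_\alpha^2, & i = k,\ j \ne l, \\ 0, & i \ne k. \end{cases} \tag{2.5} $$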
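
The two-step procedure behind model (2.1) - (2.3) can be sketched in a few lines. This is a minimal illustration only, assuming Python's standard `random` module; the function names, the seed, and the values chosen for `mu`, `sigma_a`, and `sigma_e` are hypothetical and do not come from the report.

```python
import random


def draw_finite_population(N, M, mu, sigma_a, sigma_e, rng):
    """Step 1: draw N primaries of M secondaries each from the super-population."""
    population = []
    for _ in range(N):
        alpha_i = rng.gauss(0.0, sigma_a)  # primary effect alpha_i
        # y_ij = mu + alpha_i + e_ij for the M secondaries of this primary
        population.append([mu + alpha_i + rng.gauss(0.0, sigma_e) for _ in range(M)])
    return population


def two_stage_sample(population, n, m, rng):
    """Step 2: pick n primaries, then m secondaries in each, without replacement."""
    primaries = sorted(rng.sample(range(len(population)), n))
    return {i: sorted(rng.sample(range(len(population[i])), m)) for i in primaries}


rng = random.Random(1976)  # fixed seed so the sketch is repeatable
pop = draw_finite_population(N=20, M=100, mu=50.0, sigma_a=3.0, sigma_e=5.0, rng=rng)
picked = two_stage_sample(pop, n=4, m=10, rng=rng)
sample = [[pop[i][j] for j in js] for i, js in picked.items()]
```

Here `pop` plays the role of the finite population (the "large sample" from the super-population) and `sample` holds the n x m observed values.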

The finite population of size $MN$ is represented by

$$ Y = 1\mu + H \tag{2.6} $$

where $Y$ is the $(MN \times 1)$ vector of finite population observations, $1$ is
a $(MN \times 1)$ vector of ones, and $H$ is a $(MN \times 1)$ vector of random variables.

The covariance matrix of $H$ is the $(MN \times MN)$ block diagonal matrix

[equation illegible in source]

with $I$ denoting the identity matrix of order $M$, $J$ denoting the

[remainder of this passage illegible in source]
in which $Y_s$ is a $(mn \times 1)$ vector of sampled observations, its complement is a
$((MN-mn) \times 1)$ vector of the unobserved population elements, $1_s$ is a
$(mn \times 1)$ vector of ones, and its complement is a $((MN-mn) \times 1)$ vector of ones. The
estimator analogous to (2.10) but based on only the sample quantities
is

$$ \hat{\bar{Y}}_E = (1_s^T V_s^{-1} 1_s)^{-1} 1_s^T V_s^{-1} Y_s = \frac{1}{nm}\sum_{i}^{n}\sum_{j}^{m} y_{ij} \tag{2.14} $$

where

[equation (2.15) illegible in source]

and

$$ V_i^{-1} = I_m - \frac{\rho}{1 + m\rho}\,J_m, \qquad i = 1, \ldots, n. \tag{2.16} $$
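
The closed form in (2.16) can be checked numerically. The sketch below assumes, in line with (2.5), that the covariance block for the m sampled elements of one primary is proportional to I + rho*J, with rho the ratio of the between-primary to the within-primary variance; the helper names are illustrative only.

```python
def block(m, rho):
    """I_m + rho * J_m: covariance of one primary's sample, in units of sigma_e^2."""
    return [[(1.0 if r == c else 0.0) + rho for c in range(m)] for r in range(m)]


def block_inverse(m, rho):
    """Closed-form inverse I_m - rho / (1 + m * rho) * J_m."""
    k = rho / (1.0 + m * rho)
    return [[(1.0 if r == c else 0.0) - k for c in range(m)] for r in range(m)]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[r][t] * b[t][c] for t in range(len(b))) for c in range(len(b[0]))]
            for r in range(len(a))]


m, rho = 4, 0.75
product = matmul(block(m, rho), block_inverse(m, rho))
# Largest deviation of the product from the identity matrix.
max_error = max(abs(product[r][c] - (1.0 if r == c else 0.0))
                for r in range(m) for c in range(m))
# 1' V_i^{-1} 1, which should equal m / (1 + m * rho).
ones_quadratic = sum(sum(row) for row in block_inverse(m, rho))
```

The quantity `ones_quadratic` is the per-primary contribution that makes the weighted estimator in (2.14) collapse to the simple sample mean when all primaries contribute m elements.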


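Since both $\bar{Y}$ and $\hat{\bar{Y}}_E$ are unbiased estimators of $\mu$,
$E(\bar{Y} - \hat{\bar{Y}}_E) = 0$, and $\hat{\bar{Y}}_E$ is a natural estimator for $\bar{Y}$.

Using the results of Hartley and Sielken [1975], the variance of
$(\bar{Y} - \hat{\bar{Y}}_E)$ is given by

$$ V(\bar{Y} - \hat{\bar{Y}}_E) = \sigma_\epsilon^2\left[(1_s^T V_s^{-1} 1_s)^{-1} - (1^T V^{-1} 1)^{-1}\right] = \frac{\sigma_\alpha^2}{n}\left(1 - \frac{n}{N}\right) + \frac{\sigma_\epsilon^2}{mn}\left(1 - \frac{mn}{MN}\right). \tag{2.17} $$


2.2 Confidence Intervals on Y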


Since for the super-population both $\bar{Y}$ and $\hat{\bar{Y}}_E$ are statistics, a
$100(1-\alpha)\%$ "confidence interval" on $\bar{Y}$ will be interpreted to mean

$$ \mathrm{Prob}(\bar{Y} \in [\text{Lower bound},\ \text{Upper bound}]) = 1 - \alpha. \tag{2.18} $$

Under the assumption that the ratio of variances $\rho$ and hence $V$
is known, an exact $100(1-\alpha)\%$ confidence interval on $\bar{Y}$ corresponding
to Hartley and Sielken is

$$ \hat{\bar{Y}}_E \pm \hat{\sigma}_s\, t_{\alpha/2;\,mn-1} \sqrt{\frac{1+m\rho}{nm} - \frac{1+M\rho}{MN}} \tag{2.19} $$

where

[equations (2.20) to (2.22), which define the quantities used in (2.19), are illegible in source]

However, since $\rho$ is rarely known in practice, two confidence intervals
which do not require its knowledge are developed in this section.

One method for constructing an approximate confidence interval
on $\bar{Y}$ is based on the fact that $V(\bar{Y} - \hat{\bar{Y}}_E)$ in (2.17) is a linear
combination of variance components. To set a confidence interval
on $\bar{Y}$, some well known results from variance components analysis are
stated as theorems (see, e.g., Graybill [1961]).
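
As a numerical check that the variance in (2.17) is indeed a linear combination of the two variance components, the sketch below evaluates it in two ways. It assumes the quadratic forms reduce to (1 + m*rho)/(nm) for the sample and (1 + M*rho)/(NM) for the population, with rho the ratio of the between-primary to the within-primary variance; function names and the example values are illustrative.

```python
def variance_matrix_form(N, M, n, m, var_a, var_e):
    """sigma_e^2 * [(1_s' V_s^-1 1_s)^-1 - (1' V^-1 1)^-1] for equal-sized primaries."""
    rho = var_a / var_e
    sample_term = (1.0 + m * rho) / (n * m)
    population_term = (1.0 + M * rho) / (N * M)
    return var_e * (sample_term - population_term)


def variance_component_form(N, M, n, m, var_a, var_e):
    """(var_a / n)(1 - n/N) + (var_e / mn)(1 - mn/MN)."""
    return (var_a / n) * (1.0 - n / N) + (var_e / (m * n)) * (1.0 - (m * n) / (M * N))


v_matrix = variance_matrix_form(20, 100, 4, 10, var_a=9.0, var_e=25.0)
v_components = variance_component_form(20, 100, 4, 10, var_a=9.0, var_e=25.0)
# A complete census (n = N, m = M) leaves nothing to estimate.
v_census = variance_component_form(20, 100, 20, 100, var_a=9.0, var_e=25.0)
```

Both forms give 2.4125 for these inputs, and the census case gives zero, as the finite population corrections require.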


To show this, notice that

$$ \bar{Y} - \hat{\bar{Y}}_E = \frac{1}{MN}\Big\{(mn - MN)\,\bar{y} + \sum_i^{n}\sum_j^{M-m} y_{ij} + \sum_i^{N-n}\sum_j^{M} y_{ij}\Big\} \tag{2.41} $$

where $\sum_i^{n}\sum_j^{M-m} y_{ij}$ represents the sum of those non-sampled elements whose
primaries were selected and $\sum_i^{N-n}\sum_j^{M} y_{ij}$ represents the sum of those
non-sampled elements whose primaries were not selected.


Since $g$ is a function of the sampled elements, and $\mathrm{Cov}(y_{ij}, y_{kl}) = 0$
for $i \ne k$, then $g$ is independent of $\sum_i^{N-n}\sum_j^{M} y_{ij}$. To show $\sum_i^{n}\sum_j^{M-m} y_{ij}$ is
independent of $g$, it must be shown to be independent of both $s_b^2$ and
$s_w^2$. Defining

$$ \bar{y}_r = \frac{1}{m}\sum_j^{m} y_{rj}, \tag{2.42} $$

it can be seen that $\mathrm{Cov}(\sum_i^{n}\sum_j^{M-m} y_{ij},\ \bar{y}_r - \bar{y}) = 0$. Since the $y_{ij}$ are
normal random variables, $\sum_i^{n}\sum_j^{M-m} y_{ij}$ must be independent of any
combination of $(\bar{y}_1 - \bar{y}, \ldots, \bar{y}_n - \bar{y})$, and therefore independent of $s_b^2$.
Likewise, $\mathrm{Cov}(\sum_i^{n}\sum_j^{M-m} y_{ij},\ y_{rj} - \bar{y}_r) = 0$ and $\sum_i^{n}\sum_j^{M-m} y_{ij}$ is independent of
$s_w^2$. Finally, it must be shown that $\bar{y}$ is independent of both $s_b^2$ and
$s_w^2$. Observe that $\mathrm{Cov}(\bar{y}_i, \bar{y}_u) = 0$ for $i \ne u$, and the $\bar{y}_i$ are therefore
distributed as normal random variables with mean $\mu$ and variance
$\sigma_\alpha^2 + \sigma_\epsilon^2/m$. A well known result is that $\bar{y}$ is stochastically independent
of $(\bar{y}_1 - \bar{y}, \ldots, \bar{y}_n - \bar{y})$ (see, e.g., Hogg and Craig [1970, p. 163]).
Hence, $\bar{y}$ is independent of any combination of $(\bar{y}_1 - \bar{y}, \ldots, \bar{y}_n - \bar{y})$,
and therefore independent of $s_b^2$. Now observe that $\mathrm{Cov}(y_{ij} - \bar{y}_i,\ \bar{y}_i) = 0$
and hence $(y_{ij} - \bar{y}_i)$ is independent of $\bar{y}_i$. Therefore, $\bar{y} = \sum_i \bar{y}_i/n$ is
independent of $(y_{ij} - \bar{y}_i)$ and likewise $s_w^2$.

Since $(\bar{Y} - \hat{\bar{Y}}_E)$ and $n'g/V(\bar{Y} - \hat{\bar{Y}}_E)$ are independent, an approximate
$100(1-\alpha)\%$ confidence interval on $\bar{Y}$ is

$$ [\hat{\bar{Y}}_E \pm t_{\alpha/2;\,n'}\sqrt{g}\,]. \tag{2.43} $$


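Due to the particular form of $V(\bar{Y} - \hat{\bar{Y}}_E)$, an exact confidence
interval on $\bar{Y}$ not previously discussed in the literature can be
developed by considering contrasts of $y_{ij}$. Let

[equations (2.44) to (2.48) illegible in source]

and the $\ell_{ij}$'s, $c_1$, $c_2$, and $c_3$ are constants. Assuming the model
in (2.1) - (2.3),

$$ d_i \sim N(0,\ \sigma_\epsilon^2 c_3), \tag{2.49} $$

$$ \bar{y}_i \sim N\big(\mu,\ \sigma_\alpha^2 + \tfrac{1}{m}\sigma_\epsilon^2\big), \tag{2.50} $$

and

$$ u_i \sim N\big(c_1\mu,\ c_1^2(\sigma_\alpha^2 + \tfrac{1}{m}\sigma_\epsilon^2) + c_2^2 c_3 \sigma_\epsilon^2\big) \tag{2.51} $$

as $\bar{y}_i$ and $d_i$ are independent. The independence of $\bar{y}_i$ and $d_i$ can be

[remainder of this passage illegible in source]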

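Hence, if it can be shown that $(\bar{Y} - \hat{\bar{Y}}_E)$ is independent of
$(n-1)g_e/V(\bar{Y} - \hat{\bar{Y}}_E)$, then

[equation illegible in source]

As before,

$$ \bar{Y} - \hat{\bar{Y}}_E = \frac{1}{MN}\Big\{(mn - MN)\,\bar{y} + \sum_i^{n}\sum_j^{M-m} y_{ij} + \sum_i^{N-n}\sum_j^{M} y_{ij}\Big\} $$

and $\sum_i^{N-n}\sum_j^{M} y_{ij}$ is independent of $g_e$ which is a function of sample
elements only. Now,

$$ u_i - \bar{u} = c_1(\bar{y}_i - \bar{y}) + c_2\Big(\sum_j^{m} \ell_{ij} y_{ij} - \frac{1}{n}\sum_i^{n}\sum_j^{m} \ell_{ij} y_{ij}\Big) $$

and using the results stated earlier in this section, $\sum_i^{n}\sum_j^{M-m} y_{ij}$ and $\bar{y}$
are both independent of $(\bar{y}_i - \bar{y})$ and $\sum_j \ell_{ij}(y_{ij} - \bar{y}_i) = \sum_j \ell_{ij} y_{ij}$. An
exact $100(1-\alpha)\%$ confidence interval on $\bar{Y}$ is therefore

$$ [\hat{\bar{Y}}_E \pm t_{\alpha/2;\,n-1}\sqrt{g_e}\,]. \tag{2.64} $$

A numerical example of this procedure is given in Appendix A.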
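
The exact interval can be sketched in code under several stated assumptions, since the defining equations are not legible in this copy. The sketch takes d_i as the contrast of the sampled values in primary i, u_i = c_1 * ybar_i + c_2 * d_i, and g_e as the sample variance of the u_i with n - 1 degrees of freedom. The constants c_1^2 = (N - n)/(nN) and c_2^2 c_3 = (M - m)/(MNm) are chosen so that the variance of u_i equals the variance of the estimation error for the mean; they are a reconstruction, not a quotation. The function name and the caller-supplied `t_value` (the upper alpha/2 point of Student's t with n - 1 degrees of freedom) are hypothetical.

```python
import math


def exact_interval(sample, ell, N, M, t_value):
    """Exact interval for the finite population mean (sketch under stated assumptions).

    sample  : n lists, each holding the m sampled values of one primary
    ell     : m contrast coefficients with zero sum (same for every primary)
    t_value : upper alpha/2 point of Student's t with n - 1 degrees of freedom
    """
    n, m = len(sample), len(sample[0])
    if n < 2 or m < 2 or len(ell) != m or abs(sum(ell)) > 1e-12:
        raise ValueError("need n >= 2, m >= 2 and m zero-sum contrast coefficients")
    c3 = sum(x * x for x in ell)                      # sum of squared coefficients
    c1 = math.sqrt((N - n) / (n * N))                 # assumed c_1
    c2 = math.sqrt((M - m) / (M * N * m * c3))        # assumed c_2
    means = [sum(row) / m for row in sample]          # primary sample means
    d = [sum(l * y for l, y in zip(ell, row)) for row in sample]  # contrasts d_i
    u = [c1 * yb + c2 * di for yb, di in zip(means, d)]
    u_bar = sum(u) / n
    g_e = sum((ui - u_bar) ** 2 for ui in u) / (n - 1)
    estimate = sum(means) / n                         # simple sample mean
    half_width = t_value * math.sqrt(g_e)
    return estimate - half_width, estimate + half_width, g_e


low, high, g_e = exact_interval([[1, 3], [2, 6], [4, 4]], [-1.0, 1.0],
                                N=6, M=4, t_value=4.303)
```

The t quantile is passed in rather than computed so that the sketch needs only the standard library; 4.303 is the tabulated two-sided 5% point for 2 degrees of freedom.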

It should be noted that this is an exact confidence interval
for any choice of $\ell_{ij}$ as long as $\sum_j^{m} \ell_{ij} = 0$ and $\sum_j^{m} \ell_{ij}^2 = c_3$ for all $i$.
Since the length of the confidence interval is determined by the
value of $\sqrt{g_e}$, it would seem to be important to minimize this
quantity when selecting the $\ell_{ij}$ and the corresponding value for $c_3$.
However, since the distribution of $g_e$ does not depend on $\ell_{ij}$, any
convenient set of $\ell_{ij}$ may be used. For example, if $m$ is even, let

$$ \ell_{ij} = \begin{cases} -1, & j = 1, \ldots, \frac{m}{2}, \\ +1, & j = \frac{m}{2}+1, \ldots, m, \end{cases} \tag{2.65} $$

and, if $m$ is odd, let

$$ \ell_{ij} = \begin{cases} -1, & j = 1, \ldots, \frac{m-1}{2}, \\ 0, & j = \frac{m+1}{2}, \\ +1, & j = \frac{m+3}{2}, \ldots, m, \end{cases} \tag{2.66} $$

for all $i$. The robustness of the confidence interval to model
breakdown may however depend on the $\ell_{ij}$, and this problem is discussed
in Section 2.5.
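
A convenient set of coefficients of this kind can be generated as follows. This is a sketch, assuming the -1 / 0 / +1 pattern with the zero placed at the middle position when m is odd; the function name is illustrative.

```python
def contrast_coefficients(m):
    """Return m coefficients of -1, 0, +1 whose sum is zero.

    For even m the first half is -1 and the second half is +1; for odd m the
    middle coefficient is 0. The sum of squares is m (even) or m - 1 (odd).
    """
    if m < 2:
        raise ValueError("at least two sampled secondaries are needed")
    half = m // 2
    middle = [0] if m % 2 else []
    return [-1] * half + middle + [1] * half


even_case = contrast_coefficients(4)
odd_case = contrast_coefficients(5)
c3_even = sum(x * x for x in even_case)
c3_odd = sum(x * x for x in odd_case)
```

With this choice the sum of squared coefficients is the same in every primary, as the exactness argument requires.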


2.3  Comparison  of  Confidence  Intervals 


The confidence intervals in (2.43) and (2.64) are now compared.
Of course if $\rho$ is known, (2.19) provides an exact confidence interval
with $mn-1$ degrees of freedom and would be superior to either one.
Disregarding any consideration of the "goodness" of the approximation
in (2.43), the criterion for comparison is the degrees of freedom
associated with the t-statistic.

The degrees of freedom in the exact t-statistic of (2.64) are
$n-1$ and in (2.43) they are $n'$ where $n'$ is defined in (2.37). After
some algebraic simplifications,

$$ n' = (n - 1)K \tag{2.67} $$

where

$$ K = \frac{(1 + d_1\delta)^2}{1 + d_2 d_1^2 \delta^2}, \tag{2.68} $$

$$ d_1 = \frac{p(1-q)}{(1-p)}, \tag{2.69} $$

$$ d_2 = \frac{Np - 1}{Np(Mq - 1)}, \tag{2.70} $$

and

[equations (2.71) to (2.73), which define $\delta$, $p$, and $q$, are illegible in source]

Whenever $K$ is greater than one, the expected length of the approximate
confidence interval will be less than the expected length of the exact
confidence interval. Notice that $K$ is never less than one for
$\delta \in [0, 1]$. This can be seen by noting that $K(0) = 1$, $K(1) > 1$, and
that the only possible stationary point for $\delta$ in $[0, 1]$ is a maximum,
i.e., the only possible $\delta$ in $[0, 1]$ such that $K'(\delta) = 0$ also has
$K''(\delta) < 0$. Table 2 gives the values of $K$ for selected values of $\delta$,
$p$, and $q$, in a population with $N = 20$ and $M = 100$.

TABLE 2

Values of K with N = 20 and M = 100

                          δ
  p     q      .05      .10      .50     1.00
 .2    .2    1.020    1.040    1.210    1.438
 .2    .4    1.015    1.030    1.155    1.322
 .2    .6    1.010    1.020    1.102    1.210
 .2    .8    1.005    1.010    1.051    1.102
 .4    .2    1.054    1.109    1.599    2.321
 .4    .4    1.040    1.082    1.439    1.953
 .4    .6    1.027    1.054    1.284    1.603
 .4    .8    1.013    1.027    1.138    1.284
 .6    .2    1.123    1.254    2.516    4.526
 .6    .4    1.092    1.188    2.093    3.543
 .6    .6    1.061    1.124    1.688    2.546
 .6    .8    1.030    1.061    1.322    1.688
 .8    .2    1.344    1.734    6.002   11.719
 .8    .4    1.254    1.535    4.678   10.154
 .8    .6    1.166    1.345    3.207    6.496
 .8    .8    1.082    1.166    1.956    3.216
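
The entries of Table 2 can be checked numerically. The sketch below assumes that p and q are the sampling fractions n/N and m/M (so that Np and Mq are the sample sizes), and that K = (1 + d_1*delta)^2 / (1 + d_2 * d_1^2 * delta^2) with d_1 = p(1 - q)/(1 - p) and d_2 = (Np - 1)/(Np(Mq - 1)); the function name is illustrative.

```python
def k_factor(delta, p, q, N, M):
    """Ratio of approximate to exact degrees of freedom, n' / (n - 1)."""
    n, m = N * p, M * q                    # sample sizes implied by the fractions
    d1 = p * (1.0 - q) / (1.0 - p)
    d2 = (n - 1.0) / (n * (m - 1.0))
    return (1.0 + d1 * delta) ** 2 / (1.0 + d2 * (d1 * delta) ** 2)


# A few spot checks against Table 2 (N = 20, M = 100).
k_small = k_factor(0.05, 0.2, 0.2, 20, 100)   # table entry 1.020
k_mid = k_factor(0.50, 0.6, 0.2, 20, 100)     # table entry 2.516
k_large = k_factor(1.00, 0.8, 0.2, 20, 100)   # table entry 11.719
k_zero = k_factor(0.0, 0.8, 0.2, 20, 100)     # K(0) = 1
```

Looping `k_factor` over the sixteen (p, q) pairs and four delta values reproduces the layout of the table.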

Although the degrees of freedom are less for the exact
confidence interval than for the approximate interval, when $\delta$ is
small, i.e., when the between primary variation is larger than
the within primary variation, $K$ is close to one and the degrees
of freedom are nearly the same. In most survey populations, $\delta$ is
in fact small, and use of the exact confidence interval seems
advantageous. Even when $\delta$ and $K$ are large, the t-values associated
with the two intervals will not vary greatly if $n$ is large, and the
exact interval again seems appropriate.

2.4  Estimation  of  the  Finite  Population  Total 

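The corresponding results for the population total $Y = \sum_i^{N}\sum_j^{M} y_{ij}$
follow immediately. Selecting the estimator

$$ \hat{Y}_E = \frac{MN}{mn}\sum_i^{n}\sum_j^{m} y_{ij} \tag{2.74} $$

implies $E(Y - \hat{Y}_E) = 0$ and

$$ V(Y - \hat{Y}_E) = M^2N^2\Big\{\frac{\sigma_\alpha^2}{n}\Big(1 - \frac{n}{N}\Big) + \frac{\sigma_\epsilon^2}{mn}\Big(1 - \frac{mn}{MN}\Big)\Big\}. \tag{2.75} $$

The unbiased estimator of $V(Y - \hat{Y}_E)$ is

$$ g = \frac{M^2N^2}{n}\Big(1 - \frac{n}{N}\Big)s_b^2 + \frac{M^2N}{m}\Big(1 - \frac{m}{M}\Big)s_w^2. \tag{2.76} $$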


The  approximate  100(l-a)%  confidence  interval  on  Y is 


A 


« 


• T'  * 


fES 
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fYE  4 ta/2; n*  ^ 1 


(2.77) 


where  g is  as  defined  in  (2.76).  The  exact  100(l-a)%  confidence 
interval  on  Y is 


[YE  1 *0/2;  n-1^5 


(2.78) 


where  is  identical  to  (2.59)  except  that 


2 M2!)2  ,N-n, 

c.  = -(~^~) 

1 n N 


(2.79) 


2 MN . . 

c~c  = — (M  - m) 
2 J m 


(2.80) 
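
The formulas (2.74), (2.76), (2.79), and (2.80) can be sketched numerically as follows. This is only an illustrative sketch: the function names are hypothetical, and it assumes that $s_b^2$ is the sample variance of the n primary means and $s_w^2$ is the pooled within-primary variance, since the defining equations of Section 2.2 are not repeated here.

```python
def total_estimate(sample, N, M):
    """Estimate the population total (2.74) and its variance estimator g (2.76).

    sample: list of n lists, each holding the m observed y_ij of one primary.
    Assumption: s_b^2 is the variance of the primary means and s_w^2 the
    pooled within-primary variance.
    """
    n, m = len(sample), len(sample[0])
    y_hat = N * M * sum(sum(row) for row in sample) / (n * m)
    means = [sum(row) / m for row in sample]
    grand_mean = sum(means) / n
    s_b2 = sum((mu - grand_mean) ** 2 for mu in means) / (n - 1)
    s_w2 = sum((y - mu) ** 2
               for row, mu in zip(sample, means) for y in row) / (n * (m - 1))
    g = (M**2 * N**2 / n) * (1 - n / N) * s_b2 \
        + (M**2 * N / m) * (1 - m / M) * s_w2
    return y_hat, g


def exact_interval_constants(N, M, n, m):
    """Constants c_1^2 (2.79) and c_2^2 c_3 (2.80) for the total."""
    c1_sq = (M**2 * N**2 / n) * (N - n) / N
    c2sq_c3 = (M * N / m) * (M - m)
    return c1_sq, c2sq_c3


def variance_of_total(N, M, n, m, var_a, var_e):
    """V(Y - Y_E) from (2.75) for given variance components."""
    return M**2 * N**2 * (var_a / n * (1 - n / N)
                          + var_e / (m * n) * (1 - m * n / (M * N)))


if __name__ == "__main__":
    y_hat, g = total_estimate([[1, 3], [5, 7]], N=4, M=10)
    print(y_hat, g)
```

Under these assumptions the constants satisfy $c_1^2(\sigma_\alpha^2 + \sigma_\epsilon^2/m) + c_2^2 c_3 \sigma_\epsilon^2 = V(Y - \hat{Y}_E)$, which the sketch can be used to check for any choice of variance components.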


2.5  Robustness  to  Model  Breakdown 


The results of Section 2.1 and Section 2.2 were developed under the
assumptions stated in (2.1) - (2.3). This section examines the
robustness of $\hat{\bar{Y}}_E$ and the exact confidence interval to a
breakdown in the assumed model. Notice that if primaries are drawn
without replacement and secondaries are sampled independently within
each primary, $\hat{\bar{Y}}_E$ is the classical unbiased estimator for
$\bar{Y}$. Therefore, since

$$E(\bar{Y} - \hat{\bar{Y}}_E \mid \text{any finite population}) = 0 ,$$   (2.81)

it follows that $E(\bar{Y} - \hat{\bar{Y}}_E) = 0$ no matter what is
assumed about the super-population.

Similarly, the estimator g in (2.35) is the classical unbiased
estimator for $V(\hat{\bar{Y}}_E)$ if primaries are selected without
replacement and secondaries are randomly selected within each primary.
Hence, $E(g - V(\bar{Y} - \hat{\bar{Y}}_E)) = 0$ for any assumed
super-population model.

The robustness of the exact confidence interval in (2.64) to
model breakdown must rely on the robustness of the t-statistic.
However, for a specific model breakdown, it may be possible to
select the $\ell_{ij}$'s so that the consequences are minimized.

As an example, consider the situation where the $\epsilon_{ij}$'s have a
non-normal distribution with

$$E(\epsilon_{ij}) = 0 ,$$   (2.82)

$$E(\epsilon_{ij}^2) = \sigma_\epsilon^2 ,$$   (2.83)

$$E(\epsilon_{ij}^3) = \mu_3 ,$$   (2.84)

and

$$E(\epsilon_{ij}^4) = \mu_4 .$$   (2.85)

Although

$$u_i = c_1 \bar{y}_i + c_2 d_i$$   (2.86)

is now non-normal, an appropriate choice of the $\ell_{ij}$'s can minimize
the non-normality. Since the $\ell_{ij}$'s affect $u_i$ only through the
variable $d_i$, they should be chosen so that $d_i$ behaves as a normal
random variable. For the i-th primary, let

$$d_i = \sum_{j}^{m} \ell_{ij} \epsilon_{ij}$$   (2.87)

and

$$\xi_j = \ell_{ij} \epsilon_{ij} .$$   (2.88)

From (2.82) and (2.83), it follows that

$$E(\xi_j) = 0$$   (2.89)

and

$$V(\xi_j) = \ell_{ij}^2 \sigma_\epsilon^2 .$$   (2.90)

Notice that if $|\ell_{ij}|$ is equal for all j, the $\xi_j$'s are
independent random variables with equal first and second moments. The
$\xi_j$'s then satisfy the Lindeberg condition, and $d_i$ has a limiting
distribution which is normal (see, e.g., Gnedenko [1963, p. 290]).
Hence, if the $\ell_{ij}$'s are selected such that $|\ell_{ij}|$ is equal
for all j, the non-normality of $u_i$ is reduced.

It also seems desirable to choose the $\ell_{ij}$'s so that the third
and fourth central moments of $d_i$ correspond to those of a normal
random variable. This implies setting

$$E(d_i^3) = 0$$   (2.91)

and

$$E(d_i^4) = 3\,(E(d_i^2))^2 .$$   (2.92)

Keeping $\sum_{j}^{m} \ell_{ij} = 0$, $\sum_{j}^{m} \ell_{ij}^2 = c_3$, and
using (2.82) - (2.85), it follows that

$$E(d_i) = \sum_{j}^{m} \ell_{ij} E(\epsilon_{ij}) = 0 ,$$   (2.93)

$$E(d_i^2) = \sigma_\epsilon^2 c_3 ,$$   (2.94)

$$E(d_i^3) = \mu_3 \sum_{j}^{m} \ell_{ij}^3 ,$$   (2.95)

and

$$E(d_i^4) = \mu_4 \sum_{j}^{m} \ell_{ij}^4 + 6 \sigma_\epsilon^4 \sum_{j<j'} \ell_{ij}^2 \ell_{ij'}^2 .$$   (2.96)

Therefore, satisfying (2.91) and (2.92) implies

$$\sum_{j}^{m} \ell_{ij}^3 = 0$$   (2.97)

and

$$\sum_{j}^{m} \ell_{ij}^4 \left( \frac{\mu_4}{3 \sigma_\epsilon^4} - 1 \right) = 0 .$$   (2.98)

If $|\ell_{ij}|$ is equal for all j, (2.97) is satisfied, and for
$|\ell_{ij}| \neq 0$, (2.98) is most closely satisfied with small values
of $\ell_{ij}$. Hence, to reduce the non-normality of $u_i$ when
$\epsilon_{ij}$ is non-normal, select the $\ell_{ij}$'s so that
$|\ell_{ij}|$ is equal for all j, and $\sum_{j}^{m} \ell_{ij}^4$ is small.
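
The moment formulas (2.94) - (2.96) can be evaluated directly to compare candidate sets of contrast coefficients. The sketch below is illustrative only: the function names are hypothetical, and the error moments used in the example (variance 1, third moment 2, fourth moment 9, those of a centered unit exponential) are an assumed choice, not values from the report.

```python
import math


def contrast_moments(l, sigma2, mu3, mu4):
    """Second, third and fourth moments of d = sum_j l_j * eps_j.

    Uses (2.94)-(2.96); the eps_j are independent with mean 0,
    variance sigma2, third moment mu3 and fourth moment mu4.
    """
    c3 = sum(x**2 for x in l)
    second = sigma2 * c3
    third = mu3 * sum(x**3 for x in l)
    cross = sum(l[j]**2 * l[k]**2
                for j in range(len(l)) for k in range(j + 1, len(l)))
    fourth = mu4 * sum(x**4 for x in l) + 6 * sigma2**2 * cross
    return second, third, fourth


def excess_kurtosis(l, sigma2, mu3, mu4):
    """E(d^4) / E(d^2)^2 - 3, which is zero for a normal variable."""
    second, _, fourth = contrast_moments(l, sigma2, mu3, mu4)
    return fourth / second**2 - 3


if __name__ == "__main__":
    # Two contrast sets with sum l_j = 0 and sum l_j^2 = 1 (m = 4).
    equal_abs = [0.5, -0.5, 0.5, -0.5]
    unequal_abs = [x / math.sqrt(12) for x in (3, -1, -1, -1)]
    for name, l in (("equal |l|", equal_abs), ("unequal |l|", unequal_abs)):
        _, third, _ = contrast_moments(l, 1.0, 2.0, 9.0)
        print(name, third, excess_kurtosis(l, 1.0, 2.0, 9.0))
```

With $|\ell_{ij}|$ equal for all j and $c_3$ fixed, $\sum_j \ell_{ij}^4 = c_3^2/m$ is as small as it can be, so the excess kurtosis of $d_i$ is that of $\epsilon_{ij}$ divided by m.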

As another example, assume that the $\epsilon_{ij}$ are normally
distributed but that $V(\epsilon_{ij}) = \sigma_i^2$, where $\sigma_i^2$ is
not the same for all i. Under such a model,

$$V(u_i) = c_1^2 \sigma_\alpha^2 + \sigma_i^2 \left( \frac{c_1^2}{m} + c_2^2 c_3 \right)$$   (2.99)

is no longer equal for all i. However, if the values of $\sigma_i^2$ are
known or can be estimated, different values of $c_2^2 c_3$ can be chosen
for each primary so that $\sigma_i^2 (c_1^2/m + (c_2^2 c_3)_i)$ will be
equal for all primaries.

Since no choice of the $\ell_{ij}$'s is best for every possible model
breakdown, the $\ell_{ij}$'s should be chosen to protect against the
breakdown that is most likely to occur in a particular situation.

3.  TWO-STAGE SAMPLING WITH PRIMARIES OF UNEQUAL SIZE

3.1  Definition of the Model

A two-stage sampling design will now be considered for the
case when primaries are not of equal size. Let

N = number of primaries in finite population,
n = number of primaries sampled,
$M_i$ = size of i-th primary,
$m_i$ = number of secondaries selected from the i-th primary,
$M_0 = \sum_{i}^{N} M_i$ = total elements in population, and
$m_0 = \sum_{i}^{n} m_i$ = total elements in sample.

It is assumed that all of these variables are known. The
super-population is again represented as

$$y_{ij} = \mu + \alpha_i + \epsilon_{ij}$$   (3.1)

where

$$\alpha_i \sim N(0, \sigma_\alpha^2) ,$$   (3.2)

$$\epsilon_{ij} \sim N(0, \sigma_\epsilon^2) ,$$   (3.3)

and $\alpha_i$ and $\epsilon_{ij}$ are independent. As in Section 2.1,
redefine (3.1) as

$$y_{ij} = \mu + \eta_{ij}$$   (3.4)

where the $\eta_{ij}$'s are normal random variables with mean zero and

$$E(\eta_{ij} \eta_{kl}) = \begin{cases} \sigma_\alpha^2 + \sigma_\epsilon^2 , & i = k,\ j = l , \\ \sigma_\alpha^2 , & i = k,\ j \neq l , \\ 0 , & i \neq k . \end{cases}$$   (3.5)

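The finite population is represented as

$$\mathbf{Y} = \mathbf{1}\mu + \mathbf{H}$$   (3.6)

where $\mathbf{Y}$ is the ($M_0 \times 1$) vector of finite population
observations, $\mathbf{1}$ is a ($M_0 \times 1$) vector of ones, and
$\mathbf{H}$ is a ($M_0 \times 1$) vector of random variables. The
covariance matrix for $\mathbf{H}$ is now the ($M_0 \times M_0$) block
diagonal matrix

$$\sigma_\epsilon^2 V = \sigma_\epsilon^2 \begin{pmatrix} V_1 & 0 & \cdots & 0 \\ 0 & V_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & V_N \end{pmatrix}$$   (3.7)

where

$$V_i = I_{M_i} + \rho J_{M_i} , \quad i = 1, \ldots, N ,$$   (3.8)

with $I_{M_i}$ denoting the identity matrix of order $M_i$, $J_{M_i}$
denoting the ($M_i \times M_i$) matrix with all elements equal to one, and
$\rho = \sigma_\alpha^2 / \sigma_\epsilon^2$.

Recall that in Section 2.1, the finite population mean was
expressible as the BLUE of μ. However, in the present situation
where primaries are of unequal size, the BLUE of μ is

$$\hat{\mu} = (\mathbf{1}^T V^{-1} \mathbf{1})^{-1} \mathbf{1}^T V^{-1} \mathbf{Y} = \frac{\sum_{i}^{N} \sum_{j}^{M_i} \dfrac{y_{ij}}{1 + M_i \rho}}{\sum_{i}^{N} \dfrac{M_i}{1 + M_i \rho}} .$$   (3.9)

Since there is no linear transformation of $\hat{\mu}$ that equals
$\bar{Y}$, $\hat{\mu}$ does not now readily suggest an estimator of
$\bar{Y}$. For this reason, a different method for the estimation of
$\bar{Y}$ is introduced.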
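
The closed form in (3.9) can be checked against the matrix expression numerically. The following is a minimal sketch, assuming NumPy is available; the function names and the example data are hypothetical.

```python
import numpy as np


def blue_closed_form(groups, rho):
    """Closed-form BLUE of mu from (3.9); groups[i] holds the y_ij of primary i."""
    num = sum(np.sum(g) / (1 + len(g) * rho) for g in groups)
    den = sum(len(g) / (1 + len(g) * rho) for g in groups)
    return num / den


def blue_matrix(groups, rho):
    """BLUE of mu computed as (1' V^-1 1)^-1 1' V^-1 Y with V block diagonal."""
    sizes = [len(g) for g in groups]
    M0 = sum(sizes)
    V = np.zeros((M0, M0))
    start = 0
    for Mi in sizes:
        # Block V_i = I + rho * J for the i-th primary, as in (3.8).
        V[start:start + Mi, start:start + Mi] = np.eye(Mi) + rho * np.ones((Mi, Mi))
        start += Mi
    y = np.concatenate(groups)
    one = np.ones(M0)
    v_inv_one = np.linalg.solve(V, one)  # V is symmetric, so this is V^-1 1
    return (v_inv_one @ y) / (v_inv_one @ one)


if __name__ == "__main__":
    groups = [[1.0, 2.0], [3.0, 4.0, 5.0], [10.0]]
    print(blue_closed_form(groups, 0.7), blue_matrix(groups, 0.7))
    print(np.mean(np.concatenate(groups)))  # finite population mean, for contrast
```

With unequal primary sizes and ρ > 0 the BLUE generally differs from the finite population mean; the two coincide when ρ = 0.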


3.2  Estimation and Variance Formulas for the Finite Population Mean

Express the finite population in terms of the least squares fit

$$y_{ij} = a_i + e_{ij} , \quad i = 1, \ldots, N , \quad j = 1, \ldots, M_i ,$$   (3.10)

where

$$a_i = \frac{\sum_{j}^{M_i} y_{ij}}{M_i} = \bar{Y}_i$$   (3.11)

minimizes $\sum_{i}^{N} \sum_{j}^{M_i} (y_{ij} - a_i)^2$. The finite
population mean is

$$\bar{Y} = \frac{\sum_{i}^{N} M_i a_i}{M_0} = \frac{\sum_{i}^{N} M_i \bar{Y}_i}{M_0} .$$   (3.12)

Obviously, an estimator of $\bar{Y}$ such that
$E(\bar{Y} - \hat{\bar{Y}}_U) = 0$ is

$$\hat{\bar{Y}}_U = \sum_{i}^{N} \frac{M_i}{M_0} \hat{a}_i$$   (3.13)

where $E(\hat{a}_i - a_i) = 0$.


For the primaries selected in the sample,

$$\hat{a}_i = \frac{\sum_{j}^{m_i} y_{ij}}{m_i} = \bar{y}_i$$   (3.14)

is the classical estimator of $a_i$. For the primaries of the finite
population not represented in the sample, the only knowledge about
them is that the $a_i$ have been selected from a population with mean μ.
A logical estimator for $a_i$ is therefore the BLUE of μ. The BLUE of
μ computed from the sample is

$$\hat{\mu} = \frac{\sum_{i}^{n} \bar{y}_i / V(\bar{y}_i)}{\sum_{i}^{n} 1 / V(\bar{y}_i)}$$   (3.15)

where

$$V(\bar{y}_i) = \sigma_\alpha^2 + \sigma_\epsilon^2 / m_i .$$   (3.16)

Unfortunately, $\sigma_\alpha^2$ and $\sigma_\epsilon^2$ are seldom known in
practice and $V(\bar{y}_i)$ cannot be computed. Koch [1967] has
considered three other unbiased estimators of μ for the model assumed
in (3.1) - (3.3). These three estimators are

$$\bar{y}_a = \sum_{i}^{n} k_i m_i \bar{y}_i ,$$   (3.17)

$$\bar{y}_b = \frac{\sum_{i}^{n} m_i \bar{y}_i}{m_0} ,$$   (3.18)

and

$$\bar{y}_c = \frac{\sum_{i}^{n} \bar{y}_i}{n} ,$$   (3.19)

where

$$k_i = \frac{m_0 - m_i}{m_0^2 - \sum_{i}^{n} m_i^2} .$$   (3.20)

Of course when all $m_i$ are equal, all three estimators are the same.

To compare the three estimators it is appropriate to examine
their variances. It can be seen that

$$V(\bar{y}_a) = \left(\sum_{i}^{n} k_i^2 m_i^2\right) \sigma_\alpha^2 + \left(\sum_{i}^{n} k_i^2 m_i\right) \sigma_\epsilon^2 = A_1 \sigma_\alpha^2 + A_2 \sigma_\epsilon^2 ,$$   (3.21)

$$V(\bar{y}_b) = \frac{\sum_{i}^{n} m_i^2}{m_0^2} \sigma_\alpha^2 + \left(\frac{1}{m_0}\right) \sigma_\epsilon^2 = B_1 \sigma_\alpha^2 + B_2 \sigma_\epsilon^2 ,$$   (3.22)

$$V(\bar{y}_c) = \left(\frac{1}{n}\right) \sigma_\alpha^2 + \frac{1}{n^2} \sum_{i}^{n} \frac{1}{m_i} \sigma_\epsilon^2 = C_1 \sigma_\alpha^2 + C_2 \sigma_\epsilon^2 .$$   (3.23)

Koch presents a table with n = 5 and, for various values of $m_i$
(i = 1, ..., 5), shows that the following inequalities hold:

$$C_1 \le A_1 \le B_1$$   (3.24)

for the coefficients of $\sigma_\alpha^2$, and

$$B_2 \le A_2 \le C_2$$   (3.25)

for the coefficients of $\sigma_\epsilon^2$. The solution to which of the
three estimators has the smallest variance is therefore dependent on
the actual values of $\sigma_\alpha^2$ and $\sigma_\epsilon^2$. Since
$\bar{y}_a$ is the intermediate estimator in regard to the coefficients
of $\sigma_\alpha^2$ and $\sigma_\epsilon^2$, it will be chosen as the
estimator of μ for purposes of illustration. Koch further shows that
$\bar{y}_a$ is optimal with respect to a particular quadratic,
location-sensitive criterion.
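
The coefficients in (3.21) - (3.23) are easy to tabulate for any set of subsample sizes. The sketch below is illustrative; the function name and the example sizes are hypothetical and are not taken from Koch's table.

```python
def koch_coefficients(m):
    """Variance coefficients (A1, A2), (B1, B2), (C1, C2) from (3.21)-(3.23).

    m: list of the n subsample sizes m_i (n >= 2).
    """
    n = len(m)
    m0 = sum(m)
    sum_sq = sum(mi**2 for mi in m)
    k = [(m0 - mi) / (m0**2 - sum_sq) for mi in m]  # k_i from (3.20)
    a = (sum((ki * mi)**2 for ki, mi in zip(k, m)),
         sum(ki**2 * mi for ki, mi in zip(k, m)))
    b = (sum_sq / m0**2, 1 / m0)
    c = (1 / n, sum(1 / mi for mi in m) / n**2)
    return a, b, c


if __name__ == "__main__":
    (A1, A2), (B1, B2), (C1, C2) = koch_coefficients([2, 4, 6, 8, 10])
    print("sigma_alpha^2 coefficients:", C1, A1, B1)
    print("sigma_eps^2 coefficients:  ", B2, A2, C2)
```

For the sizes 2, 4, 6, 8, 10 the ordering of (3.24) and (3.25) holds, and for equal $m_i$ all three pairs of coefficients coincide.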

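Substituting (3.14) and (3.17) into (3.13) gives

$$\hat{\bar{Y}}_U = \pi \bar{y}_r + (1 - \pi) \bar{y}_a$$   (3.26)

where

$$\pi = \frac{\sum_{i}^{n} M_i}{M_0}$$   (3.27)

and

$$\bar{y}_r = \frac{\sum_{i}^{n} M_i \bar{y}_i}{\sum_{i}^{n} M_i} .$$   (3.28)

Notice that $\bar{y}_r$ is the classical ratio estimator of $\bar{Y}$ for
two-stage sampling when units are selected with equal probabilities
(see, e.g., Cochran [1963, p. 300]).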

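The variance of $(\bar{Y} - \hat{\bar{Y}}_U)$ is computed by writing

$$\bar{Y} - \hat{\bar{Y}}_U = \frac{1}{M_0} \left[ \sum_{i}^{N-n} Y_i + \sum_{i}^{n} \left( Y_i - y_{i\cdot} \left( \frac{M_i}{m_i} + M_0 (1 - \pi) k_i \right) \right) \right]$$   (3.29)

where

$$Y_i = \sum_{j}^{M_i} y_{ij} ,$$   (3.30)

$y_{i\cdot} = \sum_{j}^{m_i} y_{ij}$ denotes the sample total of the i-th
sampled primary, and the first sum runs over the N - n primaries not in
the sample. Evaluating the variance of (3.29) under the model gives

$$V(\bar{Y} - \hat{\bar{Y}}_U) = \frac{\sigma_\alpha^2}{M_0^2} \left[ \sum_{i}^{N-n} M_i^2 + M_0^2 (1 - \pi)^2 \sum_{i}^{n} k_i^2 m_i^2 \right] + \frac{\sigma_\epsilon^2}{M_0^2} \left[ \sum_{i}^{n} \frac{M_i^2}{m_i} + M_0 \left( 1 - 2\pi + (1 - \pi) \left[ M_0 (1 - \pi) \sum_{i}^{n} k_i^2 m_i + 2 \sum_{i}^{n} k_i (M_i - m_i) \right] \right) \right] .$$   (3.37)

Notice if all $M_i = M$ and all $m_i = m$, (3.37) reduces to

$$V(\bar{Y} - \hat{\bar{Y}}_U) = \frac{\sigma_\alpha^2}{n} \left( 1 - \frac{n}{N} \right) + \frac{\sigma_\epsilon^2}{mn} \left( 1 - \frac{mn}{MN} \right) ,$$   (3.38)

which is the same result as (2.17) in Section 2.1.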
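
The two bracketed coefficients of (3.37) can be computed directly, which also gives a quick numerical check of the reduction to the equal-size formula. This is a sketch with hypothetical names; it assumes at least two sampled primaries so that $k_i$ is defined.

```python
def variance_coefficients(M_all, sampled, m):
    """Coefficients of sigma_alpha^2 and sigma_eps^2 in V(Ybar - Ybar_U), (3.37).

    M_all:   sizes M_i of all N primaries.
    sampled: indices (into M_all) of the n sampled primaries, n >= 2.
    m:       subsample sizes m_i, in the same order as `sampled`.
    """
    sampled_set = set(sampled)
    M0 = sum(M_all)
    Ms = [M_all[i] for i in sampled]
    pi = sum(Ms) / M0
    m0 = sum(m)
    denom = m0**2 - sum(mi**2 for mi in m)
    k = [(m0 - mi) / denom for mi in m]  # k_i from (3.20)
    unsampled_sq = sum(Mi**2 for i, Mi in enumerate(M_all)
                       if i not in sampled_set)
    coef_alpha = (unsampled_sq
                  + M0**2 * (1 - pi)**2
                  * sum((ki * mi)**2 for ki, mi in zip(k, m))) / M0**2
    inner = (M0 * (1 - pi) * sum(ki**2 * mi for ki, mi in zip(k, m))
             + 2 * sum(ki * (Mi - mi) for ki, Mi, mi in zip(k, Ms, m)))
    coef_eps = (sum(Mi**2 / mi for Mi, mi in zip(Ms, m))
                + M0 * (1 - 2 * pi + (1 - pi) * inner)) / M0**2
    return coef_alpha, coef_eps


if __name__ == "__main__":
    # Equal sizes: N = 5, M = 10, n = 2, m = 4.
    print(variance_coefficients([10] * 5, [0, 1], [4, 4]))
    # Unequal sizes, for comparison.
    print(variance_coefficients([5, 10, 20, 40, 25], [1, 3], [3, 6]))
```

In the equal-size case the two returned values equal $(1/n)(1 - n/N)$ and $(1/mn)(1 - mn/MN)$.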

A comment should be made about the preceding results and those
to follow in Section 4. When primaries are of unequal size, even
though $M_1, \ldots, M_N$ are fixed and assumed known, in a strict
probabilistic sense the $M_1, \ldots, M_n$ corresponding to the sampled
primaries are really random variables whose realization depends
upon which primaries are sampled. Hence, the arguments used above
and those to follow in Section 4 are really conditional arguments
for given values of $M_1, \ldots, M_n$. However, since the unbiasedness
of the estimators of $\bar{Y}$ and the confidence levels of the
corresponding confidence intervals will not depend upon the values of
$M_1, \ldots, M_n$, these properties will also apply in an unconditional
sense.
3.3  Confidence Intervals on $\bar{Y}$


The problem of estimating a linear combination of variance
components such as $V(\bar{Y} - \hat{\bar{Y}}_U)$ is much more difficult
for the unbalanced case, i.e., all $m_i$ not equal, than for the balanced
case considered in Section 2.2. Searle [1971b] cites two problems in
the unbalanced case; namely, several methods of estimation are
available with no clear decision on which is best, and all methods
involve cumbersome algebra. Searle [1971a, 1971b] gives an extensive
survey of various methods to estimate variance components. Perhaps the
most popular method is the analysis of variance method suggested by
Henderson [1953], which for the model assumed in (3.1) - (3.3), yields
the unbiased estimators


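$$\hat{\sigma}_\epsilon^2 = \frac{\sum_{i}^{n} \sum_{j}^{m_i} (y_{ij} - \bar{y}_i)^2}{m_0 - n} ,$$   (3.39)

$$\hat{\sigma}_\alpha^2 = \frac{\sum_{i}^{n} m_i (\bar{y}_i - \bar{y})^2 - (n - 1) \hat{\sigma}_\epsilon^2}{m_0 - \sum_{i}^{n} m_i^2 / m_0} ,$$   (3.40)

where $\bar{y}$ is the overall sample mean. An unbiased estimator for
$V(\bar{Y} - \hat{\bar{Y}}_U)$ would therefore be

$$g_u = b_1 \hat{\sigma}_\alpha^2 + b_2 \hat{\sigma}_\epsilon^2$$   (3.41)

where

$$b_1 = \frac{1}{M_0^2} \left[ \sum_{i}^{N-n} M_i^2 + M_0^2 (1 - \pi)^2 \sum_{i}^{n} k_i^2 m_i^2 \right] ,$$   (3.42)

$$b_2 = \frac{1}{M_0^2} \left[ \sum_{i}^{n} \frac{M_i^2}{m_i} + M_0 \left( 1 - 2\pi + (1 - \pi) \left[ M_0 (1 - \pi) \sum_{i}^{n} k_i^2 m_i + 2 \sum_{i}^{n} k_i (M_i - m_i) \right] \right) \right] .$$   (3.43)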
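
The analysis of variance estimators (3.39) and (3.40) can be sketched as below. The function name is hypothetical, and the divisor $m_0 - \sum m_i^2 / m_0$ in (3.40) is the usual one-way unbalanced analysis of variance divisor, assumed here to match the report's form.

```python
def anova_variance_components(groups):
    """Analysis-of-variance estimates of (sigma_alpha^2, sigma_eps^2).

    groups: list of n lists; groups[i] holds the m_i sampled y_ij of
    primary i. The estimate of sigma_alpha^2 may come out negative.
    """
    n = len(groups)
    m = [len(g) for g in groups]
    m0 = sum(m)
    means = [sum(g) / len(g) for g in groups]
    overall = sum(sum(g) for g in groups) / m0
    ss_within = sum((y - mu)**2 for g, mu in zip(groups, means) for y in g)
    ss_between = sum(mi * (mu - overall)**2 for mi, mu in zip(m, means))
    sigma_e2 = ss_within / (m0 - n)
    sigma_a2 = ((ss_between - (n - 1) * sigma_e2)
                / (m0 - sum(mi**2 for mi in m) / m0))
    return sigma_a2, sigma_e2


def g_u(b1, b2, sigma_a2, sigma_e2):
    """Linear combination (3.41) of the two variance component estimates."""
    return b1 * sigma_a2 + b2 * sigma_e2


if __name__ == "__main__":
    s_a2, s_e2 = anova_variance_components([[1, 3], [4, 6, 8], [10]])
    print(s_a2, s_e2)
```

The coefficients $b_1$ and $b_2$ passed to `g_u` are those of (3.42) and (3.43).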


However, as Searle [1971a, 1971b] reports, the distributions of the
variance component estimators are unknown, although they may be
expressed as linear combinations of non-central chi-squares as
discussed by Harville [1969]. The only partial exception is that
under the assumption that the random effects have normal distributions,

$$\hat{\sigma}_\epsilon^2 \sim \frac{\sigma_\epsilon^2 \, \chi^2_{(m_0 - n)}}{m_0 - n} .$$   (3.44)

Obviously since the distributions of the variance component estimators
themselves are unknown, the distribution of a linear combination
such as $g_u$ is also unknown.

One method to approximate the distribution of $g_u$ is to equate its
first two moments to those of a chi-square variable. That is, let

$$g_u \;\dot\sim\; b \, \chi^2_{(f)} .$$   (3.45)


•4 


V(g  ) = b,V(a")  + b,V(o‘)  + 2b  b Co v(ff 


Solving  for  b and  f gives 


br"V(a  )+b^V(o  )+2b1b_  Cov(a 


b,V(a  )+b^V(a  )+2b  b Cov(a 


For  the  model  assumed  in  (3.1)  - (3.3)  Searle  [1971a]  gives  formulas 


for  V(a  ),  V(a  ),  and  Cov(o 


o ) under  the  normality  assumptions 


However,  since  these  terms  are  functions  of  squares  and  products  of 


he  further  cites  results  derived  by  Ahrens  which  provide 


unbiased  estimators  for  them.  At  best  the  above  procedure  would  lead 


to  only  an  approximate  distribution  of  g , and  since  estimators  of 


unknown  parameters  are  used  when  solving  for  b and  f,  the  method 


Since  the  distribution  of  g is  not  available  for  the  unbalanced 


case,  it  appears  the  simplifying  assumption  that  all  m.  = m must  be 




as  an  unbiased  estimator  of  V(Y  - Y ).  Furthermore 


An  approximate  100(l-a)%  confidence  interval  on  Y is  therefore 


An  exact  confidence  interval  on  Y may  also  be  constructed 


Consider  the  random  variable 
Equating $V(u_i)$ to $V(\bar{Y} - \hat{\bar{Y}}_U)$ in (3.56) and solving
for $c_1^2$ and $c_{2i}^2 c_{3i}$ yields

$$c_1^2 = \frac{1}{M_0^2} \left\{ \sum_{i}^{N-n} M_i^2 + M_0^2 (1 - \pi)^2 \sum_{i}^{n} k_i^2 m_i^2 \right\} ,$$   (3.71)

$$c_{2i}^2 c_{3i} = -\frac{c_1^2}{m_i} + \frac{1}{M_0^2} \left\{ \sum_{i}^{n} \frac{M_i^2}{m_i} + M_0 \left( 1 - 2\pi + (1 - \pi) \left[ M_0 (1 - \pi) \sum_{i}^{n} k_i^2 m_i + 2 \sum_{i}^{n} k_i (M_i - m_i) \right] \right) \right\} .$$   (3.72)

It then follows that

$$\frac{\bar{Y} - \hat{\bar{Y}}_U}{\sqrt{g_E}} \sim t_{(n^* - 1)}$$   (3.73)

where

$$g_E = \frac{\sum_{i}^{n^*} (u_i - \bar{u})^2}{n^* - 1} ,$$   (3.74)

$n^*$ = number of sampled primaries with $c_{2i}^2 c_{3i} \ge 0$,

and an exact 100(1 - α)% confidence interval on $\bar{Y}$ is

$$[\hat{\bar{Y}}_U \pm t_{\alpha/2;\, n^* - 1} \sqrt{g_E}\,] .$$   (3.75)

As in the equal primary case, any convenient set of $\ell_{ij}$ may be
used as long as $\sum_{j}^{m_i} \ell_{ij} = 0$ and $c_{2i}^2 c_{3i} \ge 0$
satisfies (3.72).

3.4  Robustness of $\hat{\bar{Y}}_U$ to Model Breakdown


The robustness of $\hat{\bar{Y}}_U$ to model breakdown is now examined.
Recall

$$\hat{\bar{Y}}_U = \pi \bar{y}_r + (1 - \pi) \bar{y}_a$$   (3.76)

where

$$\bar{y}_r = \frac{\sum_{i}^{n} M_i \bar{y}_i}{\sum_{i}^{n} M_i} ,$$   (3.77)

$$\bar{y}_a = \sum_{i}^{n} k_i m_i \bar{y}_i ,$$   (3.78)

$$k_i = \frac{m_0 - m_i}{m_0^2 - \sum_{i}^{n} m_i^2} ,$$   (3.79)

and

$$\pi = \frac{\sum_{i}^{n} M_i}{M_0} .$$   (3.80)

Let $S_1$ denote that primaries are selected with equal probabilities
and without replacement. Let $S_2$ denote that primaries are drawn
with probability proportional to size and with replacement. In
both $S_1$ and $S_2$ it is assumed that the secondaries are sampled at
random. Then for a given finite population,


E<^|Si>  - S v + SjVj 


(3.81) 


U 

H 


KC^Isp  - i EMjv,  + 


(3.82) 


Thus, for a given finite population $\hat{\bar{Y}}_U$ is a biased
estimator of $\bar{Y}$ under either $S_1$ or $S_2$ even though


E(yu|s1)  = Y 


(3.83) 


and,  if  all  nr  are  equal. 


E(yJS2)  = Y 


(3. 84) 


For the special case when only one primary is selected, i.e., n = 1,

$$\hat{\bar{Y}}_U = \bar{y}_1 ,$$   (3.85)

$$E(\bar{y}_1 \mid S_2) = \bar{Y} ,$$   (3.86)

and hence $E(\bar{Y} - \hat{\bar{Y}}_U) = 0$ for any assumed
super-population. This special case occurs when a stratified
multi-stage design is used and only one primary is chosen from each
stratum.

Another special case of interest is stratified sampling, i.e.,
when all primaries are sampled. In this case, π = 1, and if
secondaries are sampled at random from each primary

$$\hat{\bar{Y}}_U = \frac{\sum_{i}^{N} M_i \bar{y}_i}{M_0}$$   (3.87)

is the classical unbiased estimator of $\bar{Y}$. Hence, the
unbiasedness of $\hat{\bar{Y}}_U$ in stratified sampling is robust to
model breakdown. This result was also obtained by Hartley and Sielken
[1975] who further discuss the robustness of the confidence interval
for stratified sampling.

The results of Section 2.5 concerning selection of the $\ell_{ij}$'s
also apply to the unequal primary case.


3.5  Comparison  of  Y^  with  the  BLUE 

The estimator $\hat{\bar{Y}}_U$ was motivated by the least squares
estimators of the finite population parameters. An alternative
unbiased estimator $\hat{\bar{Y}}$ for $\bar{Y}$ can be constructed by
minimizing $V(\bar{Y} - \hat{\bar{Y}})$ subject to the restriction that

$$E(\bar{Y} - \hat{\bar{Y}}) = 0 .$$   (3.88)

In particular, let $\hat{\bar{Y}} = \sum_{i}^{n} h_i \bar{y}_i$ where the
$h_i$'s do not depend on the $y_{ij}$'s. Then

$$E(\bar{Y} - \hat{\bar{Y}}) = \mu \left( 1 - \sum_{i}^{n} h_i \right)$$   (3.89)

and the condition that

$$\sum_{i}^{n} h_i = 1$$   (3.90)

is imposed to ensure that $\hat{\bar{Y}}$ is unbiased. Minimizing
$V(\bar{Y} - \hat{\bar{Y}})$ with respect to $h_i$ subject to (3.90)
yields


Hence, the estimator Y


is the BLUE of Y. This estimator has been previously suggested by


Scott and Smith [1969]. However Y requires knowledge of o


and these values are seldom known in practice. Scott and Smith make


the simplifying assumption that all m. = m, and recommend estimating


Notice  that  in  the  special  case  when  M.  = M and  m 


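If $m_i = m$ but not all $M_i$ are equal, then

$$h_i = \frac{1}{n} + \frac{\lambda}{M_0} \left( M_i - \frac{\sum_{i}^{n} M_i}{n} \right)$$   (3.99)

and

$$\hat{\bar{Y}}_S = \pi_s \bar{y}_r + (1 - \pi_s) \bar{y}$$   (3.100)

where

$$\pi_s = \frac{\lambda \sum_{i}^{n} M_i}{M_0} ,$$   (3.101)

$$\bar{y}_r = \frac{\sum_{i}^{n} M_i \bar{y}_i}{\sum_{i}^{n} M_i} ,$$   (3.102)

and

$$\bar{y} = \frac{\sum_{i}^{n} \sum_{j}^{m} y_{ij}}{nm} .$$   (3.103)

The estimator $\hat{\bar{Y}}_U$ in (3.26) is therefore a special case of
$\hat{\bar{Y}}_S$ in which λ = 1.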
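
The special-case relation between (3.100) and (3.26) can be checked numerically. The sketch below uses hypothetical function names and example values, treats λ as a supplied number, and assumes equal subsample sizes $m_i = m$.

```python
def y_scott_smith(M_sampled, M0, ybar, lam):
    """Estimator of the form (3.100): pi_s * ybar_r + (1 - pi_s) * ybar.

    M_sampled: sizes M_i of the n sampled primaries; M0: population size;
    ybar: the n sample primary means (equal m_i assumed); lam: lambda.
    """
    n = len(M_sampled)
    pi_s = lam * sum(M_sampled) / M0
    y_r = sum(Mi * yi for Mi, yi in zip(M_sampled, ybar)) / sum(M_sampled)
    y_bar = sum(ybar) / n  # equals the overall sample mean when m_i = m
    return pi_s * y_r + (1 - pi_s) * y_bar


def y_u(M_sampled, M0, ybar, m):
    """Estimator (3.26): pi * ybar_r + (1 - pi) * ybar_a, with n >= 2."""
    pi = sum(M_sampled) / M0
    m0 = sum(m)
    denom = m0**2 - sum(mi**2 for mi in m)
    k = [(m0 - mi) / denom for mi in m]
    y_r = sum(Mi * yi for Mi, yi in zip(M_sampled, ybar)) / sum(M_sampled)
    y_a = sum(ki * mi * yi for ki, mi, yi in zip(k, m, ybar))
    return pi * y_r + (1 - pi) * y_a


if __name__ == "__main__":
    M_s, M0, ybar, m = [10, 20], 100, [3.0, 5.0], [4, 4]
    print(y_u(M_s, M0, ybar, m), y_scott_smith(M_s, M0, ybar, 1.0))
    print(y_scott_smith(M_s, M0, ybar, 0.5))
```

With λ = 1 the two functions return the same value; smaller λ pulls the estimate toward the unweighted sample mean.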

Even though $\hat{\bar{Y}}_S$ minimizes $V(\bar{Y} - \hat{\bar{Y}})$,
$\hat{\bar{Y}}_U$ has the advantage that under the assumed model, its
distribution and the distribution of its estimated variance are known,
and confidence intervals can therefore be constructed on $\bar{Y}$. On
the other hand, Scott and Smith do not derive the distribution of
$\hat{\bar{Y}}_S$ when λ is unknown, and therefore make no comments
concerning confidence intervals. Further, λ will indeed be close to
one if $\sigma_\epsilon^2 / m$ is small relative to $\sigma_\alpha^2$, as
would be the case if there is a clustering effect within primaries.
As stated in Section 2.3, this type of survey population is quite
common. Finally, if m is large and the $M_i$ are of relatively equal
size, the difference between $V(\bar{Y} - \hat{\bar{Y}}_S)$ and
$V(\bar{Y} - \hat{\bar{Y}}_U)$ will be small. Hence, although
$\hat{\bar{Y}}_S$ is the BLUE of $\bar{Y}$, the fact that its
distributional properties are not known when λ is unknown suggests
that $\hat{\bar{Y}}_U$ is of greater practical value.



4.  REGRESSION ESTIMATORS IN TWO-STAGE SAMPLING


4.1  Definition of the Model


There are many situations in survey sampling in which the
variable of interest, y, is related to another variable, x. If such
a situation occurs, a more appropriate super-population model than
the one assumed in Section 3.1 is

$$y_{ij} = \mu + \alpha_i + \beta x_{ij} + \epsilon_{ij}$$   (4.1)

where

$$\alpha_i \sim N(0, \sigma_\alpha^2) ,$$   (4.2)

$$\epsilon_{ij} \sim N(0, \sigma_\epsilon^2) ,$$   (4.3)

$\alpha_i$, $x_{ij}$, and $\epsilon_{ij}$ are all independent, and μ and β
are constants.

An estimator for the finite population mean is developed under
the assumptions (4.1) - (4.3) for the general case when primaries
are of unequal size and neither $\rho = \sigma_\alpha^2 / \sigma_\epsilon^2$
nor $\sigma_\alpha^2$ and $\sigma_\epsilon^2$ are known. The notation used
here is consistent with that of Section 3. In addition, $\mathbf{X}$ is
a ($M_0 \times 1$) vector of the $x_{ij}$'s in the finite population, and
$\mathbf{X}_s$ is a ($m_0 \times 1$) vector of the $x_{ij}$'s in the
sample.

4.2  Estimation and Variance Formulas for the Finite Population Mean


The finite population can be expressed as

$$y_{ij} = a_i + b x_{ij} + e_{ij} , \quad i = 1, \ldots, N , \quad j = 1, \ldots, M_i ,$$   (4.4)
where

$$b = \frac{\sum_{i}^{N} \sum_{j}^{M_i} (x_{ij} - \bar{X}_i) y_{ij}}{\sum_{i}^{N} \sum_{j}^{M_i} (x_{ij} - \bar{X}_i)^2}$$   (4.5)

and

$$a_i = \frac{\sum_{j}^{M_i} y_{ij}}{M_i} - b \frac{\sum_{j}^{M_i} x_{ij}}{M_i} , \quad i = 1, \ldots, N ,$$   (4.6)

are the least squares coefficients for the regression of $y_{ij}$ on
$x_{ij}$. The mean of the finite population can be represented as

$$\bar{Y} = \frac{\sum_{i}^{N} M_i \bar{Y}_i}{M_0} = \frac{\sum_{i}^{N} M_i (a_i + b \bar{X}_i)}{M_0} .$$   (4.7)

An estimator of $\bar{Y}$ can therefore be constructed by estimating b
and $a_i$ from the sample. Performing the regression of $y_{ij}$ on
$x_{ij}$ for the sample yields

$$\hat{b} = \frac{\sum_{i}^{n} \sum_{j}^{m_i} (x_{ij} - \bar{x}_i) y_{ij}}{\sum_{i}^{n} \sum_{j}^{m_i} (x_{ij} - \bar{x}_i)^2}$$   (4.8)

and

$$\hat{a}_i = \bar{y}_i - \hat{b} \bar{x}_i , \quad i = 1, \ldots, n ,$$   (4.9)
where

$$\bar{y}_i = \frac{\sum_{j}^{m_i} y_{ij}}{m_i} ,$$   (4.10)

$$\bar{x}_i = \frac{\sum_{j}^{m_i} x_{ij}}{m_i} .$$   (4.11)

As an estimator of $a_i$ for the primaries not selected in the
sample, the knowledge that the conditional expectation of $a_i$ given
any $\mathbf{X}$ is μ suggests using an unbiased estimator of μ.
Following Section 3.2, let the estimator for $a_i$ when the i-th primary
is not selected in the sample be

$$\hat{a}_i = \bar{y}_a - \hat{b} \bar{x}_a$$   (4.12)

where

$$\bar{y}_a = \sum_{i}^{n} k_i m_i \bar{y}_i ,$$

$$\bar{x}_a = \sum_{i}^{n} k_i m_i \bar{x}_i ,$$

$$k_i = \frac{m_0 - m_i}{m_0^2 - \sum_{i}^{n} m_i^2} .$$


Substituting  b and  a^  from  (4.8),  (4.9),  and  (4.12)  for  b and 


a.  in  (4.7)  gives 


Obviously, to use this estimator, $\bar{X}$ must be known. It is of
interest to note that when n = N, π = 1, and $\hat{\bar{Y}}_R$
simplifies to the classical combined regression estimator for
stratified sampling, namely,


Y can  be  expressed  as 


Noting  that 


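$$E(Y_i \mid \mathbf{X}, \mathbf{X}_s) = M_i \mu + \beta \sum_{j}^{M_i} x_{ij} ,$$   (4.25)

$$E(y_{i\cdot} \mid \mathbf{X}, \mathbf{X}_s) = m_i \mu + \beta \sum_{j}^{m_i} x_{ij} ,$$   (4.26)

and

$$E(\hat{b} \mid \mathbf{X}, \mathbf{X}_s) = \beta ,$$   (4.27)

it follows that

$$E(\bar{Y} - \hat{\bar{Y}}_R \mid \mathbf{X}, \mathbf{X}_s) = 0 .$$   (4.28)

Also,

$$V(Y_i \mid \mathbf{X}, \mathbf{X}_s) = M_i^2 \sigma_\alpha^2 + M_i \sigma_\epsilon^2 ,$$   (4.29)

$$V(y_{i\cdot} \mid \mathbf{X}, \mathbf{X}_s) = m_i^2 \sigma_\alpha^2 + m_i \sigma_\epsilon^2 ,$$   (4.30)

$$\mathrm{Cov}(Y_i, y_{i\cdot} \mid \mathbf{X}, \mathbf{X}_s) = m_i M_i \sigma_\alpha^2 + m_i \sigma_\epsilon^2 ,$$   (4.31)

$$\mathrm{Cov}(Y_i, \hat{b} \mid \mathbf{X}, \mathbf{X}_s) = 0 ,$$   (4.32)

$$\mathrm{Cov}(y_{i\cdot}, \hat{b} \mid \mathbf{X}, \mathbf{X}_s) = 0 ,$$   (4.33)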
If in addition, all $M_i = M$


4.3  Confidence  Intervals  on  Y 


In  constructing  a confidence  interval  on  Y,  it  is  assumed 


that  all  m = m.  Through  the  use  of  Theorem  2.1  of  Section  2.2 


is  an  unbiased  estimator  for  V(Y 


In  addition,  the 


conditional  distribution  of  (Y 


approximate  t-distribution  with  n'  degrees  of  freedom  where 


Hence,  only  X and  the  observed  x..’s  in  the  sample  are  needed  to 


completely  specify  the  t-distribution,  and  an  approximate  100(l-a)% 


confidence  interval  on  Y is 
$$[\hat{\bar{Y}}_R \pm t_{\alpha/2;\, n'} \sqrt{g_R}\,] .$$   (4.45)

Note that the confidence interval (4.45) depends on $\mathbf{X}$ and
$\mathbf{X}_s$ since both $g_R$ and $n'$ depend on $\mathbf{X}$ and
$\mathbf{X}_s$. However, since the confidence level, 100(1-α)%, does not
depend on either $\mathbf{X}$ or $\mathbf{X}_s$, the above confidence
interval procedure produces intervals containing $\bar{Y}$ 100(1-α)% of
the time.

An exact confidence interval for $\bar{Y}$ cannot be constructed using
the technique of Section 2.2. If $u_i$ were defined to be

$$u_i = c_1 \bar{y}_i + c_2 d_i$$

where

$$d_i = \sum_{j}^{m} \ell_{ij} y_{ij} , \quad \sum_{j}^{m} \ell_{ij} = 0 ,$$

then

$$E(u_i \mid \mathbf{X}, \mathbf{X}_s) = c_1 \mu + \beta \left( c_1 \bar{x}_i + c_2 \sum_{j}^{m} \ell_{ij} x_{ij} \right)$$

and the $u_i$'s would not have the same mean. Therefore
$\sum_{i}^{n} (u_i - \bar{u})^2$ given $\mathbf{X}$ and $\mathbf{X}_s$ would
not be a central chi-square, and hence an exact t-distribution would
not be available.
4.4  Two-Stage Sampling When Elements are Related to Primary Size


An important special case of the model in (4.1) is when
$x_{ij} = M_i$, i.e., an element in the population is related to the
size of its primary. A similar model has been considered by Cochran
[1963] to study the behavior of estimators for single-stage cluster
sampling designs. The super-population is now defined as

$$y_{ij} = \mu + \alpha_i + \beta M_i + \epsilon_{ij}$$   (4.50)

where

$$\alpha_i \sim N(0, \sigma_\alpha^2) ,$$   (4.51)

$$\epsilon_{ij} \sim N(0, \sigma_\epsilon^2) ,$$   (4.52)

$M_i$, $\alpha_i$, and $\epsilon_{ij}$ are all independent, and μ and β
are constants. The finite population can be expressed as

$$y_{ij} = a + b M_i + e_{ij}$$   (4.53)

where

$$a = \bar{Y} - b \, \frac{\sum_{i}^{N} M_i^2}{M_0} ,$$   (4.54)

$$b = \frac{\sum_{i}^{N} \sum_{j}^{M_i} M_i y_{ij} - \bar{Y} \sum_{i}^{N} M_i^2}{\sum_{i}^{N} M_i^3 - \left( \sum_{i}^{N} M_i^2 \right)^2 / M_0} .$$   (4.55)

The finite population mean is now


I 


A logical  estimator  for  Y is 


where 


are  the  least  square  coefficients  for  the  sample  regression  of 


y on  M. . Simplifying  (4.57)  gives 


Noting  that 


(4.7C 


s2  mM 


M 


- m + 2A 


mn  mB  °1  ’ 


N-n 


[ Mi  (1-tt)(tt)Mo 


M 


(4.71 


n ttV 

B = LM2 * 

i 1 n 


(4.7; 


and  n'  is  defined  in  (4.42). 


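4.5  Robustness of $\hat{\bar{Y}}_R$ to Model Breakdown


The robustness of $\hat{\bar{Y}}_R$ to model breakdown is now discussed.
Notice that $\hat{\bar{Y}}_R$ can be written as

$$\hat{\bar{Y}}_R = \hat{\bar{Y}}_U + \hat{b} \, [\bar{X} - (\pi \bar{x}_r + (1 - \pi) \bar{x}_a)]$$
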

(4.7: 


where $\hat{\bar{Y}}_U$ is defined in (3.26). As was shown in Section
3.4, for a given finite population $\hat{\bar{Y}}_U$ is a biased
estimator of $\bar{Y}$. It therefore follows that $\hat{\bar{Y}}_R$ is
also biased for $\bar{Y}$ given a fixed finite population.

Furthermore, in the special case of stratified sampling where
$\hat{\bar{Y}}_R$ is the classical combined regression estimator,
$\hat{\bar{Y}}_R$ remains biased
5.  EXTENSIONS AND CONCLUSIONS


5.1  Sampling with More Than Two Stages

5.1.1  A p-stage sampling design

The  methodology  developed  in  this  report  can  be  extended 
in  a straightforward  manner  to  higher  levels  than  the  two-stage 
designs  considered  previously.  For  a p-stage  design,  the  linear 
model  describing  the  super-population  is  assumed  to  be  the  random 
p-fold  nested  classification  model.  It  is  further  assumed  that  all 
random  factors  are  normally  distributed  with  mean  zero  and  constant 
variance,  and  that  all  factors  are  independent. 

If  all  sampling  units  are  of  equal  size,  the  least  squares 
technique  used  in  Section  2.1  can  be  used  to  construct  estimators 
for  the  finite  population  parameters.  If  sampling  units  are  of 
unequal  size,  as  in  Section  3.2,  estimators  can  be  constructed  by 
using  the  knowledge  that  non-sampled  units  are  from  a distribution 
defined  by  the  super-population  model. 

If  equal  samples  per  sampling  unit  are  selected,  an  approximate 
confidence  interval  on  Y can  be  constructed  by  using  the  following 
result  from  variance  components  analysis  (see,  e.g.,  Graybill  [1961, 
p.  347]). 

Theorem 5.1: Let the random p-fold nested classification model
hold for the super-population and let all random variables in the
model be independent and normally distributed. Let the mean square
for the i-th factor in the model be denoted by $s_i^2$. Let $s_i^2$ have
$n_i$ degrees of freedom and let $E(s_i^2) = \sigma_i^2$. Then
$v_i = n_i s_i^2 / \sigma_i^2 \sim \chi^2_{(n_i)}$ and $v_1, \ldots, v_p$
are mutually independent.

Using Theorem 2.1 of Section 2.2, it follows that
$g = \sum_{i} g_i s_i^2$ is an unbiased estimator of
$\gamma = \sum_{i} g_i \sigma_i^2$, and

$$\frac{n' g}{\gamma} \;\dot\sim\; \chi^2_{(n')}$$   (5.1)

where

$$n' = \frac{\left( \sum_{i} g_i \sigma_i^2 \right)^2}{\sum_{i} \left( g_i^2 \sigma_i^4 / n_i \right)} .$$   (5.2)

By selecting the $g_i$ in such a manner that
$\gamma = V(\bar{Y} - \hat{\bar{Y}})$, an approximate 100(1-α)%
confidence interval on $\bar{Y}$ is

$$[\hat{\bar{Y}} \pm t_{\alpha/2;\, n'} \sqrt{g}\,] .$$   (5.3)
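
The approximate degrees of freedom in (5.2) can be computed as in the sketch below. The function names are hypothetical, and the sketch assumes that in practice the observed mean squares $s_i^2$ are substituted for the unknown $\sigma_i^2$; that substitution is an assumption of this example, not a statement from the report.

```python
def linear_combination(g, mean_squares):
    """g = sum_i g_i * s_i^2, the estimator of gamma."""
    return sum(gi * si for gi, si in zip(g, mean_squares))


def approximate_df(g, mean_squares, dfs):
    """Degrees of freedom n' from (5.2), with s_i^2 plugged in for sigma_i^2.

    g:            coefficients g_i of the linear combination.
    mean_squares: observed mean squares s_i^2.
    dfs:          degrees of freedom n_i of each mean square.
    """
    numerator = linear_combination(g, mean_squares) ** 2
    denominator = sum((gi * si) ** 2 / ni
                      for gi, si, ni in zip(g, mean_squares, dfs))
    return numerator / denominator


if __name__ == "__main__":
    coefficients = [0.08, 0.02]
    mean_squares = [12.5, 3.0]
    dfs = [9, 40]
    print(linear_combination(coefficients, mean_squares))
    print(approximate_df(coefficients, mean_squares, dfs))
```

When only one mean square enters the combination, n' reduces to its degrees of freedom; with several components n' never exceeds the sum of the $n_i$.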


An  exact  confidence  interval  on  Y may  also  be  constructed  by 
considering  appropriate  linear  contrasts  of  the  observations.  An 
example  for  the  three-stage  design  is  given  in  the  following 
section. 


5.1.2  Three-stage  sampling  with  equal  sizes  and  samples 


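For the three-stage sampling design let

N = number of primaries,
M = number of secondaries/primary,
L = number of tertiaries/secondary,
n = number of sampled primaries,
m = number of sampled secondaries/sampled primary, and
ℓ = number of sampled tertiaries/sampled secondary.

The three-fold nested classification model describing the
super-population is

$$y_{ijk} = \mu + \eta_{ijk}$$   (5.4)

where

$$E(\eta_{ijk}) = 0$$   (5.5)

and

$$E(\eta_{ijk} \eta_{i'j'k'}) = \begin{cases} \sigma_\alpha^2 + \sigma_\beta^2 + \sigma_\epsilon^2 , & i = i',\ j = j',\ k = k' , \\ \sigma_\alpha^2 + \sigma_\beta^2 , & i = i',\ j = j',\ k \neq k' , \\ \sigma_\alpha^2 , & i = i',\ j \neq j' , \\ 0 , & i \neq i' . \end{cases}$$   (5.6)

Using the methodology of Section 2.1,

$$\hat{\bar{Y}} = \frac{\sum_{i}^{n} \sum_{j}^{m} \sum_{k}^{\ell} y_{ijk}}{\ell m n}$$   (5.7)

and

$$V(\bar{Y} - \hat{\bar{Y}}) = \frac{\sigma_\alpha^2}{n} \left( 1 - \frac{n}{N} \right) + \frac{\sigma_\beta^2}{mn} \left( 1 - \frac{mn}{MN} \right) + \frac{\sigma_\epsilon^2}{\ell m n} \left( 1 - \frac{\ell m n}{LMN} \right) .$$   (5.8)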


Alternatively,  V(Y  - Y)  can  be  expressed  as 


•v 


where 


Using  Theorem  5.],  an  unbiased  estimator  for  V(Y 


where 


nm£ 

iK(y.-y) 
2 - 1 


samples  from  a super-population  and  not  a fixed  finite  population. 
Trueblood  and  Cyert  [1957]  offer  an  example  of  a two-stage  sample 
design  used  to  confirm  accounts  receivable  in  a department  store. 


Over  a period  of  time,  the  account  ledgers  change,  and  the  notion 
of  a super-population  generating  a fixed  set  of  ledgers  at  a given 
point  in  time  seems  to  be  an  improvement  over  the  classical 
assumptions.  The  results  developed  in  this  report  are 
appropriate  when  sampling  from  such  a population. 

Possibilities  for  future  research  in  the  super-population 
theory  of  survey  sampling  include 

(i)  the  study  of  ratio  estimators  in  both  single  and  multi- 
stage designs, 

(ii)  an  Empirical  Bayes  approach  for  estimation  of  the  finite 
population  parameters, 

(iii)  the  further  study  of  estimation  of  finite  population 
parameters  in  surveys  with  more  than  two  stages  and 
sampling  units  of  unequal  size,  and 

(iv)  an extension of the results of Section 4 to the
multivariate case where x_ij is a vector rather than a single
variable.


Royall and Herson [1973a] have considered a special case of a
ratio  estimator  in  single  stage  designs  where  the  super-population 
model  is  a polynomial  regression  and  have  found  it  to  be  the  BLUE  of 
the  finite  population  total.  Further  research  into  the  behavior  of 
ratio  estimators  in  multi-stage  surveys  may  produce  similar  results. 

Since  super-population  theory  implies  the  parameters  of  the 
finite  population  are  truly  random  variables,  it  might  be  worthwhile 
to  consider  an  Empirical  Bayes  approach  to  the  problem  of  estimating 
the  finite  population  parameters.  An  Empirical  Bayes  approach  would 
utilize  information  from  previous  surveys  of  the  population  to 
estimate  the  current  finite  population  parameters. 

In extending the methodology developed in this report to
more than two stages when sampling units are of unequal size,
various estimators for non-sampled elements are plausible, and these
alternatives should perhaps be studied.

An  appropriate  super-population  model  for  extending  the  results 
of  Section  4 to  the  multivariate  case  is 

\[ y_{ij} = \mu + \alpha_i + x_{ij}'\beta + \epsilon_{ij}, \tag{5.45} \]

where $x_{ij}$ is a vector of variables and $\beta$ is a vector of constants.
Using  matrix  notation  and  the  arguments  of  Section  4,  estimators  for 
the  multivariate  case  could  be  developed. 

In conclusion, a super-population approach to survey sampling is
quite realistic and brings survey sampling more into the mainstream
of statistical inference than does the classical theory depicted
as Case 1 in Table 1 (p. 7). Future research will hopefully lead to

APPENDIX A: A NUMERICAL EXAMPLE FOR TWO-STAGE SAMPLING
WITH EQUAL SIZES AND SAMPLES

The  following  example  is  taken  from  Mendenhall  et  al.  [1971, 
p.  186]  with  the  modification  that  all  primaries  are  of  equal  size. 

A forester  wants  to  estimate  the  total  number  of  trees  in  a 
certain  county  which  are  infected  with  a particular  disease.  A 
two-stage  sampling  design  is  used  with  the  county  being  divided 
into  ten  primary  units  which  are  further  subdivided  into  fifteen 
secondaries  of  approximately  equal  size.  A sample  of  four  primaries 
and  six  secondaries  per  sampled  primary  is  taken.  The  survey  results 
are  given  in  Table  3. 

TABLE 3

Results of Forester's Survey

Area    Number of Infected Trees per Plot (y_ij)

 1      15, 14, 21, 13,  9, 10
 2       4,  6, 10,  9,  8,  5
 3      10, 11, 14, 10,  9, 15
 4       8,  3,  4,  1,  2,  5

For this problem N = 10, M = 15, n = 4, and m = 6. From (2.14),

\[ f_E = \frac{9}{24}. \]


Also, (2.28), (2.29), and (2.35) give

\[ s_b^2 = 117.444, \]

\[ s_w^2 = 8.983, \]

and

\[ g = 3.026. \]
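These values can be reproduced from Table 3. The sketch below rests on assumptions, since (2.28), (2.29), and (2.35) are not restated in this appendix: the first two are taken to be the usual between-primary and within-primary mean squares, and g is taken to be the linear combination of those mean squares with coefficients (1 - n/N)/(mn) and (M - m)/(mMN). All variable names are hypothetical. Under these assumptions the computation agrees with the reported figures to three decimals.

```python
# Table 3: infected trees per plot, one row per sampled primary (area)
plots = [
    [15, 14, 21, 13, 9, 10],
    [4, 6, 10, 9, 8, 5],
    [10, 11, 14, 10, 9, 15],
    [8, 3, 4, 1, 2, 5],
]
N, M = 10, 15                       # primaries; secondaries per primary
n, m = len(plots), len(plots[0])    # sampled primaries; sampled secondaries

primary_means = [sum(row) / m for row in plots]
grand_mean = sum(primary_means) / n

# Assumed form of (2.28): between-primary mean square, n - 1 df
s2_b = m * sum((pm - grand_mean) ** 2 for pm in primary_means) / (n - 1)

# Assumed form of (2.29): within-primary mean square, n(m - 1) df
s2_w = sum(
    (y - pm) ** 2 for row, pm in zip(plots, primary_means) for y in row
) / (n * (m - 1))

# Assumed form of (2.35): estimated variance as a combination of mean squares
coef_b = (1 - n / N) / (m * n)      # 0.025 for this design
coef_w = (M - m) / (m * M * N)      # 0.01 for this design
g = coef_b * s2_b + coef_w * s2_w

print(round(grand_mean, 3), round(s2_b, 3), round(s2_w, 3), round(g, 3))
```

The sample mean of 9 infected trees per plot is also the midpoint of the exact interval reported at the end of this appendix.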

The degrees of freedom given in (2.37) associated with the
approximate confidence interval are

\[ n' = 3.186. \]
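The reported degrees of freedom are consistent with a Satterthwaite-type approximation applied to the two mean squares. The sketch below assumes that (2.37) has this form and that the coefficients on the between-primary and within-primary mean squares are 0.025 and 0.01, with 3 and 20 degrees of freedom respectively. The function name is hypothetical, and the mean squares are the rounded values quoted above.

```python
import math


def satterthwaite_df(components):
    """Approximate df for a linear combination of independent mean squares.

    components: iterable of (coefficient, mean_square, df) triples.
    """
    numerator = sum(c * ms for c, ms, _ in components) ** 2
    denominator = sum((c * ms) ** 2 / df for c, ms, df in components)
    return numerator / denominator


# (coefficient, mean square, df): between primaries, then within primaries
n_prime = satterthwaite_df([(0.025, 117.444, 3), (0.01, 8.983, 20)])
df_conservative = math.floor(n_prime)   # rounding down widens the interval

print(round(n_prime, 3), df_conservative)
```

Because the between-primary term dominates the variance estimate, the approximate degrees of freedom stay close to n - 1 = 3.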

Since n' is not an integer, it is rounded down to 3, and a
conservative 95% confidence interval on $\bar{Y}$ from (2.43) is

Also, from (2.55), (2.56), and (2.59),

\[ c_2 = .15, \]

\[ c_3 = .01, \]

and

\[ g = 2.493. \]

The exact 95% confidence interval on $\bar{Y}$ from (2.64) is

\[ (3.979,\ 14.02). \]